An automated system that scrapes upcoming concerts from Bandsintown, matches the artists against Spotify's catalog, and builds personalized Spotify playlists from their top tracks.
This project was born from my love for discovering new music and attending live shows. While Bandsintown is integrated with Spotify, I wanted to create a more intelligent system that could automatically discover artists playing in my area and create personalized playlists based on my musical preferences.
The system scrapes concert data from Bandsintown, matches artists with Spotify's database, enriches the data with genre information from Discogs, and creates curated playlists. It's designed to help music lovers discover their next favorite artist before they become mainstream.
Key Innovation:
The system combines multiple APIs with filtering on musical genres and styles, so discovery can be tailored more closely to a listener's taste than basic recommendation algorithms allow.
A multi-layered system combining web automation, API integration, and intelligent data processing
Automated web scraping of Bandsintown using Selenium to gather concert information, artist names, and event dates for specific geographic areas and time periods.
Integration with Spotify API to match scraped artist names with Spotify's database, retrieve artist IDs, and access top tracks for playlist creation.
Discogs API integration to enrich artist data with genre and style information, enabling intelligent filtering based on musical preferences and taste profiles.
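The genre-based filtering described above can be pictured as a simple set-overlap check. The sketch below is only an illustration, not the project's actual code: the `ArtistProfile` type, its fields, and the `matches_preferences` helper are hypothetical names, and it assumes the genre and style labels have already been fetched from Discogs as plain strings.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ArtistProfile:
    """Hypothetical container for an artist enriched with Discogs labels."""
    name: str
    genres: set[str] = field(default_factory=set)
    styles: set[str] = field(default_factory=set)


def matches_preferences(artist: ArtistProfile, liked: set[str], blocked: set[str]) -> bool:
    """Keep an artist that has at least one liked label and no blocked label."""
    # Compare case-insensitively, since label casing may vary between sources.
    labels = {label.lower() for label in artist.genres | artist.styles}
    if labels & {label.lower() for label in blocked}:
        return False
    return bool(labels & {label.lower() for label in liked})


# Usage: filter a list of enriched artists down to the ones worth adding.
artists = [
    ArtistProfile("Example Band A", genres={"Rock"}, styles={"Shoegaze"}),
    ArtistProfile("Example Band B", genres={"Electronic"}, styles={"Techno"}),
]
keepers = [a.name for a in artists if matches_preferences(a, {"shoegaze"}, {"techno"})]
```

Treating genres and styles as one pool of labels keeps the check simple; a stricter variant could require a match on both levels.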
Key components and code snippets from the project implementation
The project uses the following libraries for web automation, data processing, and API integration:
import csv
import time
from base64 import b64encode
from datetime import datetime, timedelta

import discogs_client
import pandas as pd
import requests
import spotipy
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from spotipy.oauth2 import SpotifyOAuth
The core scraping functionality uses Selenium to automate browser interactions and extract concert data:
class Scraper:
    def __init__(self):
        options = Options()
        options.add_argument("--headless")
        service = Service(executable_path=r"C:\webdrivers\chromedriver.exe")
        self.driver = webdriver.Chrome(service=service, options=options)
        self.driver.set_page_load_timeout(120)

    def scrape_website(self, start_date, end_date):
        latitude, longitude = self.get_coordinates('San Francisco')
        df = pd.DataFrame(columns=["Band Name", "Date"])
        while start_date <= end_date:
            formatted_start_date = start_date.strftime('%Y-%m-%dT%H:%M:%S')
            formatted_end_date = (start_date + timedelta(days=2)).strftime('%Y-%m-%dT%H:%M:%S')
            url = f"https://www.bandsintown.com/choose-dates/genre/all-genres?latitude={latitude}&longitude={longitude}&calendarTrigger=false&date={formatted_start_date}%2C{formatted_end_date}"
            self.driver.get(url)
            wait = WebDriverWait(self.driver, 10)
            wait.until(EC.visibility_of_any_elements_located((By.CLASS_NAME, '_5CQoAbgUFZI3p33kRVk')))
            # Extract band and date information
            bands = self.driver.find_elements(By.CLASS_NAME, "_5CQoAbgUFZI3p33kRVk")
            dates = self.driver.find_elements(By.CLASS_NAME, "r593Wuo4miYix9siDdTP")
            for band, date in zip(bands, dates):
                band_name = band.text.encode('raw_unicode_escape').decode('utf-8', 'ignore')
                concert_date = date.text
                # Process and store data...
            start_date += timedelta(days=3)
        df.to_csv('bands_and_dates.csv', index=False)
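The scraping loop above walks the date range in three-day steps and requests a window ending two days after each start. As a sketch of how that stepping could be separated from the browser code, the generator below yields the same windows; `date_windows` and its parameters are hypothetical names introduced for illustration, not part of the project.

```python
from __future__ import annotations

from datetime import datetime, timedelta
from typing import Iterator


def date_windows(
    start: datetime,
    end: datetime,
    step_days: int = 3,
    span_days: int = 2,
) -> Iterator[tuple[datetime, datetime]]:
    """Yield (window_start, window_end) pairs covering start..end."""
    current = start
    while current <= end:
        # Like the original loop, the final window may run past `end`.
        yield current, current + timedelta(days=span_days)
        current += timedelta(days=step_days)


# Usage: format each window the way the scraper's URL expects it.
for window_start, window_end in date_windows(datetime(2024, 1, 1), datetime(2024, 1, 7)):
    date_param = (
        f"{window_start.strftime('%Y-%m-%dT%H:%M:%S')}"
        f"%2C{window_end.strftime('%Y-%m-%dT%H:%M:%S')}"
    )
```

Keeping the date arithmetic in its own function makes it testable without launching a browser.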
Authentication and playlist creation using Spotify's Web API:
# Spotify Authentication
client_id = 'your_spotify_developer_client_id'
client_secret = 'your_spotify_developer_client_secret'
redirect_uri = 'http://localhost:8888/callback'
scope = 'playlist-modify-public'

sp = spotipy.Spotify(auth_manager=SpotifyOAuth(
    client_id=client_id,
    client_secret=client_secret,
    redirect_uri=redirect_uri,
    scope=scope
))

current_user = sp.current_user()
user_id = current_user['id']

# Create playlist
playlist_name = 'your_new_playlist_name'
new_playlist = sp.user_playlist_create(user=user_id, name=playlist_name, public=True)
playlist_id = new_playlist['id']

# Add tracks to playlist
for uri in track_uris:
    if uri is not None:
        try:
            sp.playlist_add_items(playlist_id, [uri])
        except Exception as e:
            print(f"Error adding track to playlist: {e}")
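The loop above sends one request per track. Spotify's add-items endpoint accepts up to 100 items per request, so a batched variant would need far fewer calls. The following is a minimal sketch of that idea, not the project's code: `chunked` and `add_tracks_in_batches` are hypothetical helpers, and `sp` is assumed to be any client exposing `playlist_add_items(playlist_id, items)` as spotipy does.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def add_tracks_in_batches(sp, playlist_id, track_uris, batch_size=100):
    """Add tracks in batches and return how many were sent successfully."""
    # Drop missing URIs up front, mirroring the `uri is not None` check.
    uris = [uri for uri in track_uris if uri is not None]
    added = 0
    for batch in chunked(uris, batch_size):
        try:
            sp.playlist_add_items(playlist_id, batch)
            added += len(batch)
        except Exception as e:
            # A failed batch is reported and skipped; the rest still go through.
            print(f"Error adding {len(batch)} tracks to playlist: {e}")
    return added
```

The trade-off is coarser error handling: one bad URI can fail a whole batch, so a fallback that retries a failed batch track by track may be worth adding.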
Overcoming technical hurdles and implementing robust solutions
Spotify and Discogs APIs have strict rate limits that could interrupt the data collection process, especially when processing large numbers of artists.
Solution:
Implemented intelligent rate limiting with exponential backoff, batch processing, and error handling to gracefully handle API limitations while maintaining data integrity.
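As an illustration of the exponential backoff mentioned here, the sketch below retries a call with delays that double after each failure. It is a generic pattern under stated assumptions rather than the project's implementation: `call_with_backoff` and all of its parameters are hypothetical, and the injectable `sleep` argument exists only to make the helper easy to test.

```python
import random
import time


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying with exponentially growing delays on failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Up to 10% random jitter keeps parallel workers out of lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

When a response carries a `Retry-After` header, as Spotify's HTTP 429 responses do, honoring that value is usually preferable to a computed delay.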
Inconsistent artist naming conventions across platforms made it difficult to accurately match artists between Bandsintown, Spotify, and Discogs.
Solution:
Developed fuzzy matching algorithms with multiple fallback strategies, including partial string matching and similarity scoring to improve match accuracy.
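One way to picture the similarity scoring described above is with the standard library's `difflib`. The sketch below normalizes names before comparing them and only accepts a match above a threshold. The helper names, the normalization rules, and the 0.85 threshold are assumptions chosen for illustration; the project's actual matching strategy is not shown in this write-up.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name):
    """Lowercase, strip accents and punctuation, and drop a leading 'the'."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.casefold().replace("&", " and ")
    text = re.sub(r"[^\w\s]", " ", text)
    words = text.split()
    if words and words[0] == "the":
        words = words[1:]
    return " ".join(words)


def best_match(query, candidates, threshold=0.85):
    """Return the candidate most similar to `query`, or None if below threshold."""
    target = normalize_name(query)
    best_name, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, target, normalize_name(candidate)).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    return best_name if best_score >= threshold else None
```

In practice the candidates would be the artist names returned by a platform search for the scraped name, and returning `None` lets the caller fall back to another strategy instead of accepting a poor match.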
Websites frequently change their structure, making web scraping scripts brittle and prone to breaking without warning.
Solution:
Built a modular scraping framework with robust error handling, automatic retry mechanisms, and monitoring to detect and adapt to website changes.
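A simple form of adapting to markup changes is to try several locators in order and fail loudly when none of them match. The sketch below assumes a Selenium-style driver exposing `find_elements(by, value)`; `find_with_fallbacks` and `SelectorsExhausted` are hypothetical names, and any locator list passed to it would have to come from inspecting the live page.

```python
class SelectorsExhausted(RuntimeError):
    """Raised when no known locator matches, hinting that the page changed."""


def find_with_fallbacks(driver, locators):
    """Return elements for the first (by, value) locator that matches anything."""
    tried = []
    for by, value in locators:
        elements = driver.find_elements(by, value)
        if elements:
            return elements
        tried.append(f"{by}={value}")
    # Failing with the full list makes a site redesign easy to spot in logs.
    raise SelectorsExhausted("No locator matched; tried: " + ", ".join(tried))
```

With Selenium, the `by` values would be constants such as `By.CLASS_NAME` or `By.CSS_SELECTOR`. Raising a dedicated exception gives monitoring code a clear signal to alert on, rather than silently writing an empty CSV.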
The system needed to learn and adapt to individual musical preferences, going beyond simple genre classifications.
Solution:
Implemented multi-dimensional filtering based on musical styles, tempo, energy levels, and user feedback to create truly personalized recommendations.
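To make the idea of multi-dimensional filtering concrete, the sketch below combines style overlap, tempo closeness, energy closeness, and user feedback into one weighted score. Everything in it is an assumption made for illustration: the `TasteProfile` fields, the 60 BPM tempo tolerance, the feedback range of -1 to 1, and the weights are hypothetical and do not come from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TasteProfile:
    """Hypothetical summary of an artist or of a listener's preferences."""
    styles: frozenset
    tempo_bpm: float
    energy: float  # assumed to lie between 0.0 and 1.0


def jaccard(a, b):
    """Share of labels common to both sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def preference_score(candidate, taste, feedback=0.0, weights=(0.5, 0.2, 0.2, 0.1)):
    """Weighted similarity between 0 and 1; feedback ranges from -1 to 1."""
    w_style, w_tempo, w_energy, w_feedback = weights
    style_sim = jaccard(candidate.styles, taste.styles)
    # Tempo similarity falls linearly to zero at a 60 BPM difference.
    tempo_sim = max(0.0, 1.0 - abs(candidate.tempo_bpm - taste.tempo_bpm) / 60.0)
    energy_sim = 1.0 - abs(candidate.energy - taste.energy)
    # Map feedback from [-1, 1] onto [0, 1], clamping out-of-range values.
    feedback_sim = (max(-1.0, min(1.0, feedback)) + 1.0) / 2.0
    return (w_style * style_sim + w_tempo * tempo_sim
            + w_energy * energy_sim + w_feedback * feedback_sim)
```

Artists could then be ranked by this score, with a cutoff deciding which ones make it into a playlist; tuning the weights is where the personalization would happen.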
Quantifiable outcomes and real-world impact of the project
Artists Discovered
Playlists Created
Automation Rate
"This system introduced me to so many amazing artists I never would have found otherwise. The automation saves hours of manual searching and the recommendations are spot-on."
- Early Beta User
"The integration of multiple data sources and intelligent filtering creates a music discovery experience that's truly personalized and constantly surprising."
- Music Industry Professional
Planned improvements and expansion of the music discovery system
Expanding beyond Spotify to include other music streaming platforms like Apple Music, YouTube Music, and Tidal for broader accessibility.
Implementing advanced ML models for better music recommendation, including collaborative filtering and content-based recommendation systems.
Developing a user-friendly web interface for playlist management, preference settings, and discovery analytics with real-time updates.
Adding social features for playlist sharing, collaborative playlist creation, and community-based music discovery.
Whether you're a music lover wanting to try the system, a developer interested in the technical implementation, or someone looking to collaborate on similar projects, I'd love to hear from you.