Data Sources

Oral History Transcripts

We have used more than 50 transcripts of oral history interviews in our project. The original interviews come from the Hamilton College Jazz Archive, Rutgers Institute for Jazz Studies Archives, Smithsonian Jazz Oral Histories, UCLA’s Central Avenue Sounds Series, and the University of Michigan’s Nathaniel C. Standifer Video Archive of Oral History.

Hamilton College:

Benny Powell, Benny Waters, Bill Berry, Billy Taylor, Bob Haggart, Buddy DeFranco, Buddy Tate, Buster Williams, Charles Davis, Charles McPherson, Clark Terry, David Murray, Doc Cheatham, Ed Shaughnessy, Gerald Wiggins, Harold Ousley, Herbie Hancock, Jane Jarvis, Jimmy Lewis, Jimmy Owens, Joe Williams, Leslie Johnson, Lionel Hampton, Marian McPartland, Milt Hinton, Mona Hinton, Oscar Peterson, Phil Woods, Red Holloway, Ron Carter, Roswell Rudd, Slide Hampton, Stanley Kay, Vi Redd


Rutgers Institute for Jazz Studies:

Mary Lou Williams


Smithsonian Jazz Oral Histories:

Abbey Lincoln, Annie Ross, Buddy DeFranco, Danny Barker, Dave Brubeck, Delfeayo Marsalis, Jimmy Scott, John Levy, Louie Bellson, Nancy Wilson, Roy Haynes, Toshiko Akiyoshi


UCLA Central Avenue Sounds Series:

Melba Liston

University of Michigan:

Billy Eckstine, Count Basie, McCoy Tyner, Roy Eldridge, Sam Rivers


The seed of our dataset is a list of names extracted from DBpedia, the Linked Open Data version of Wikipedia. At the beginning of the project, we generated this starter list, comprising 9,300 names of jazz musicians, by filtering the DBpedia RDF extracts for jazz-related individuals.
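
As a rough illustration of this kind of filtering, the sketch below scans N-Triples lines and keeps every subject whose genre points at a resource with "jazz" in its name. It is a minimal sketch under stated assumptions, not the project's actual filter: the choice of the `dbo:genre` property as the filter criterion, the function name `jazz_related_uris`, and the sample triples (including the made-up `Example_Person` resource) are all illustrative.

```python
import re

# Matches N-Triples lines whose subject, predicate, and object are all IRIs.
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.$')

# Assumption: genre is the property used to decide what is "jazz related".
GENRE = "http://dbpedia.org/ontology/genre"


def jazz_related_uris(lines):
    """Return the sorted subject URIs whose genre object mentions 'jazz'."""
    found = set()
    for line in lines:
        match = TRIPLE.match(line.strip())
        if not match:
            continue  # skip comments, literal-valued triples, malformed lines
        subject, predicate, obj = match.groups()
        if predicate == GENRE and "jazz" in obj.lower():
            found.add(subject)
    return sorted(found)


# Illustrative triples only; they are not copied from a DBpedia extract.
SAMPLE = [
    "<http://dbpedia.org/resource/Mary_Lou_Williams> "
    "<http://dbpedia.org/ontology/genre> "
    "<http://dbpedia.org/resource/Jazz> .",
    "<http://dbpedia.org/resource/Example_Person> "
    "<http://dbpedia.org/ontology/genre> "
    "<http://dbpedia.org/resource/Opera> .",
    "# a comment line that the filter ignores",
]

jazz_uris = jazz_related_uris(SAMPLE)
```

A real extract would be read line by line from a file rather than from an in-memory list, and a stricter filter might compare against an explicit set of genre URIs instead of a substring test.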

Each entry in this list is identified by a DBpedia URI. We matched the names that our transcript analyzer recognized, using natural language processing, against the list so that we could assign each recognized name its URI.
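
The matching step could look something like the following. This is only a sketch of one plausible approach (exact lookup on normalized names); the project's actual matching logic is not described here. The helper names `normalize`, `label_from_uri`, `build_index`, and `assign_uris` are hypothetical, and the URIs are illustrative examples of the DBpedia resource naming pattern.

```python
import re
import unicodedata
from urllib.parse import unquote


def normalize(name):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r"[^\w\s]", " ", unaccented.lower())
    return " ".join(cleaned.split())


def label_from_uri(uri):
    """Turn '.../resource/Clark_Terry' into 'Clark Terry'."""
    local = unquote(uri.rsplit("/", 1)[-1]).replace("_", " ")
    # Drop a trailing disambiguation suffix such as ' (musician)'.
    return re.sub(r"\s*\([^)]*\)$", "", local)


def build_index(uris):
    """Map each normalized label to its URI."""
    return {normalize(label_from_uri(uri)): uri for uri in uris}


def assign_uris(names, index):
    """Map each recognized name to a URI, or None when nothing matches."""
    return {name: index.get(normalize(name)) for name in names}


# Illustrative URIs; the suffix on the second one is made up for the example.
index = build_index([
    "http://dbpedia.org/resource/Clark_Terry",
    "http://dbpedia.org/resource/Sam_Rivers_(example_suffix)",
])
matches = assign_uris(["clark  TERRY", "Sam Rivers", "Unknown Person"], index)
```

Exact matching on normalized names leaves nicknames, initials, and misspellings unmatched (they map to `None`), so a production matcher would likely add alias lists or fuzzy comparison on top of this.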