Posted by Glen Sears | November 4, 2017 1:12 pm
(Additional contribution by Dan Charlson & Amy Vandergon)
Structuring data so it can be easily interpreted and transported is critical for modern digital music infrastructures. The most common structured data solutions are markup languages, which take an exceptionally simple approach to structure.
Markup languages simply mark sections of a document (hence “markup”) with a descriptive label called a “tag.” These tags are then used by other software to properly display or ingest that data.
Markup languages typically consist of regular words rather than code syntax and symbols. This makes them more user-friendly and often readable by eye alone. The two most popular markup languages are HTML and XML. XML is also known by its long-form name: eXtensible Markup Language.
What is XML?
Markup languages don’t actually perform operations. Their job is to describe and organize data for software that understands those descriptions and can act on the data. HTML is made up of tags (<head>, <body>, <div>, <p>, etc.) that were agreed upon when the language was built. Every HTML developer must use only these tags to categorize data, or the software built to interpret and display the data (web browsers) will not render the elements correctly.
XML differs in that it allows a developer to create any tag they like. This is why it is known as extensible. XML developers are not limited to a predefined set of tags to describe elements of data. Data can be organized with any tag imaginable. While XML resembles HTML, its data structuring potential is far greater. In a sense, XML is a framework that allows you to create your own language for describing and transporting data.
Computer systems often store data in incompatible formats. To move data between these systems, large amounts must be converted, and incompatible data is often lost along the way. For this reason, industries, organizations, and developer communities agree on XML specifications or standards. This makes it much easier to create compatible software programs, regardless of how they’re built or where they’re situated.
XML is a common choice for exporting structured data and for sharing data between programs or companies.
How does XML serve digital music?
Let’s say Company A wants to send Company B information about a new music album. If they were using email, Company A would simply write the information down in an easy-to-understand human format. Company B could then take that information and enter it properly into their system.
Title – Album Title
Artist – Album Artist
Track 1 – Track Name
Track 2 – Track Name
Track 3 – …
But what if Company A wanted to send the information from their system directly into Company B’s system? Since the systems are almost certainly not compatible, there must be an intermediary step where the data is rewritten in a common language. This is where XML comes in. The two companies agree on a set of XML tags and their hierarchy (referred to as a “schema”), then map those values to their own systems. It may look something like this:
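As a minimal sketch, here is how the album above might be written in XML and read back with Python’s standard library. The tag names (`album`, `title`, `artist`, `tracks`, `track`), the `number` attribute, and the sample values are assumptions made for illustration, not an actual industry or MediaNet schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical album document; the tag names are illustrative assumptions.
ALBUM_XML = """
<album>
  <title>Album Title</title>
  <artist>Album Artist</artist>
  <tracks>
    <track number="1">Track Name One</track>
    <track number="2">Track Name Two</track>
  </tracks>
</album>
"""


def read_album(xml_text):
    """Map the agreed tags onto a plain dictionary for the receiving system."""
    root = ET.fromstring(xml_text.strip())
    return {
        "title": root.findtext("title"),
        "artist": root.findtext("artist"),
        "tracks": [
            (int(track.get("number")), track.text)
            for track in root.iter("track")
        ],
    }


album = read_album(ALBUM_XML)
print(album["title"], "by", album["artist"])
for number, name in album["tracks"]:
    print(f"Track {number}: {name}")
```

Company B never has to guess which line is the artist and which is the title: each value is found by its tag, wherever it sits in the document.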
While to a human the data looks virtually identical, digital systems can process data written in a common tongue much more easily. XML allows digital music services to transmit and synchronize massive amounts of content information between incompatible systems easily and accurately.
How does MediaNet use XML?
As shown above, XML is a critical part of digital music data delivery to MediaNet. Without XML feeds, content data being added to or updated in our library would require large databases or spreadsheets. Each addition (or batch of additions) would require a human to package it and send it to us, where another human would then integrate it into our system.
Thanks to XML, this is not necessary. Every time a new piece of data needs to be added to the MediaNet catalog by our Content Partners, it is simply added to the feed and picked up by our system.
MediaNet maintains its own sophisticated XML content ingestion schema. It consists of over 30 top-level elements, expanded into hundreds of metadata sub-values indicating data points such as rights, territory, currency, and usage for every track, album, artist, and composer. All told, a typical album can consist of more than 3,000 lines of XML markup, information that ensures data is ingested into our systems accurately and completely.
What challenges does XML pose?
While the beauty of XML is its dead-simple nature, that doesn’t mean there aren’t challenges in using it. Beyond the inherent possibility of coding errors, the most common XML challenge is also one of its most useful features: automation.
Using XML with custom software platforms means the schema must be agreed upon at both ends of the feed. Both tags and the acceptable values inside these tags are built into the automated software that receives them. This automation can easily be tripped up if the XML schema or acceptable values are improperly entered.
Most systems require XML information to be two things:
- Well-formed – meaning that it adheres to the XML spec itself
- Valid – meaning that it properly follows the agreed schema
Errors in either of these categories can cause XML data transfers to stutter or fail entirely.
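The difference between the two checks can be sketched in a few lines of Python. This is a simplified illustration: the `album` tag and the `REQUIRED_CHILDREN` rule are hypothetical stand-ins for a real schema, and a production feed would more likely be validated against a formal schema definition (such as an XSD) with a validating parser, which Python’s built-in `xml.etree.ElementTree` does not provide.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule standing in for a real schema: every album needs these children.
REQUIRED_CHILDREN = ("title", "artist")


def check_album(xml_text):
    """Return 'not well-formed', 'invalid', or 'ok' for one album document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return "not well-formed"  # breaks the rules of XML itself
    if root.tag != "album":
        return "invalid"
    for name in REQUIRED_CHILDREN:
        value = root.findtext(name)
        if value is None or not value.strip():
            return "invalid"  # well-formed XML, but it breaks the agreed rules
    return "ok"


print(check_album("<album><title>A</title><artist>B</artist></album>"))
print(check_album("<album><title>A</title></album>"))
print(check_album("<album><title>A</artist></album>"))
```

The second document parses cleanly but lacks an artist, so it is well-formed yet invalid. The third closes `title` with the wrong tag, so it fails before any schema rule is even consulted.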
How does MediaNet help solve those challenges?
Such automation and data input errors can, but don’t have to, halt or destroy XML data transport. MediaNet uses a three-part system to resolve feed errors, enhance data, and keep transports running smoothly:
- Erroneous data is identified and filtered from the feed into a separate error queue, leaving the rest of the feed free to ingest into our system.
- Our Content Operations Team prioritizes and manages our error queue daily to avoid long-stay metadata errors, and resolves many in the process.
- Our Rights Management Team manually uncovers and verifies additional information and data to ensure rich, accurate entries.
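The first of these steps, filtering bad items into a separate queue so the rest of the feed keeps moving, might look roughly like the sketch below. The `split_feed` function and the shape of the error records are hypothetical and are not a description of MediaNet’s actual ingestion code; only well-formedness is checked here.

```python
import xml.etree.ElementTree as ET


def split_feed(documents):
    """Separate parseable items from failures so one bad item does not block the rest."""
    ingested, error_queue = [], []
    for position, xml_text in enumerate(documents):
        try:
            root = ET.fromstring(xml_text)
        except ET.ParseError as error:
            # Record where the problem was and why, for later human review.
            error_queue.append({"position": position, "reason": str(error)})
            continue
        ingested.append(root.findtext("title"))
    return ingested, error_queue


feed = [
    "<album><title>First</title></album>",
    "<album><title>Broken</album>",  # mismatched closing tag
    "<album><title>Third</title></album>",
]
ingested, error_queue = split_feed(feed)
print(ingested)
print([entry["position"] for entry in error_queue])
```

The broken second item lands in the error queue with its position and the parser’s message, while the first and third items are ingested as usual.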
While our systems are automated by using the most effective language for collaborative data exchange, ensuring the highest possible quality of data still requires a human touch.
MediaNet’s data is the cleanest, most accurate data in the music industry because our database is built on an unbeatable combination of enterprise technology and human intuition. We are almost always able to resolve XML feed errors without any need for Content Partner involvement.
XML is the humble technology that drives many of the most successful digital music technologies. Without it, digital music infrastructure costs would rise dramatically and integrating systems would become far more difficult.
Most importantly, the ability of digital music systems to flex and change with need would disappear.
Posted by Glen Sears | May 2, 2016 9:36 am
Story of the Week
Apple Music Grows to 13 Million Subscribers
Apple Music has surpassed 13 million subscribers, Apple CEO Tim Cook revealed Tuesday. That represents growth of 2 million subscribers since the company last disclosed numbers in February.
Apple announced the growth in Apple Music subscribers as part of its fiscal second-quarter earnings release, during which it reported its first revenue declines since 2003.
But the Cupertino, Calif.-based tech giant still has a ways to go before it catches up to streaming heavyweight Spotify, which boasted 30 million paying subscribers as of March.
Top Music News Stories
Spotify Denies Security Breach After Report of Stolen Passwords, Addresses. According to a report at TechCrunch, some users’ email addresses, passwords and other account information appeared on the Pastebin website.
Future Of Music Coalition CEO Casey Rae On The Value Of Universal Data Standards. “Let’s commit to universally deployed data standards on both sides of the music copyright, common database environments for expedient matching and resolution of discrepancies, along with a protocol for universal information updates when additional data is modified by authorized parties.”
A Surprisingly Interesting Dive Into Classical Music Metadata. Breaking down how such metadata works, and what standards need to be followed in order to ensure that DSP’s classical content remains up to snuff.
Pandora’s First Quarter Financials: Ad Revenue Jumps Along With Music Costs. With revenue jumping 29 percent from nearly $231 million in the corresponding quarter in the prior year, Pandora continues its growth story—but its losses also widened, to $115.7 million from the $48.3 million loss it had in the corresponding quarter in the prior year.
ASCAP Reports $1 Billion in Revenue, Again. Within that, domestic receipts grew to $716.8 million, up 9.3 percent from the prior year’s total of $655.8 million. ASCAP also increased domestic distribution by 6.2 percent, to $573.5 million.
YouTube Changes Content ID to Allow Money Collection During Rights Investigations. Internet video giant YouTube has made a change in its Content ID evaluation process that will benefit creators whose work has been improperly challenged by a rights holder.
Regulatory Filing Reveals UMG’s Massive Effort To Block Pirates. A Universal Music Group filing with the U.S. Copyright Office, designed to bolster the case that Safe Harbor standards need an overhaul, reveals the lengths to which the company went to limit piracy on Taylor Swift’s 2014 release ‘1989.’
Our best wishes for a great week! – MediaNet
Posted by Amy Vandergon | September 16, 2015 1:34 pm
Here at MediaNet’s Content department, we spend a lot of time staring at metadata tags. We have become metadata whisperers, noticing genre-wide trends, peculiarities, and common mistakes. We nurture any problem data, and once it’s fixed we release it into our catalog. Over time we have noticed three genres in particular that often need data intervention: classical, hip hop, and electronic dance music (EDM).
Beyond the metadata similarities, these three genres are all quite different. However, they do share some additional common characteristics, including a heavy reliance on patterns, multi-movement works, and the integration of dance.
Patterns are important in every genre, but perhaps more so for these three. Classical music laid the groundwork of tonality, or the patterns of pitches our ears expect. Baroque fugues, for example, contain tightly-repeated harmonic and melodic sequences. In both hip hop and EDM, patterns manifest through repetitive samples and beat structures.
All three genres have many multi-movement works. Just as any classical symphony should be listened to in its entirety, so too should Kanye West’s The College Dropout or Daft Punk’s Random Access Memories (or any live DJ set, for that matter).
Unique styles of dance, such as the minuet and waltz, were popularized through the classical music tradition. Hip hop has spawned a wide variety of dance styles, including breaking and krumping. An entire subculture of dance has been created through the popularization of EDM. Even more important than the musical similarities of these genres is the ability of each to enact social change, becoming a voice and distraction for the oppressed.
Olivier Messiaen wrote and premiered his Quatuor pour la fin du temps, inspired by the Book of Revelation, while imprisoned in a German POW camp. N.W.A.’s Straight Outta Compton highlighted the poverty, drug abuse, and police brutality that continue to run rampant in that city. EDM is a direct descendant of disco, which began in jazz halls in Occupied France that were only allowed to play recorded music. Disco and early EDM developed largely through the work of homosexual, black, female, and Latino communities – groups which have a history of devalued cultural contributions.
Now let’s look at the practical problems they pose when it comes to metadata tags.
- Classical releases have several contributor options (composer, conductor, performer, etc.) and multiple accepted spellings of composers’ names (like Stravinsky/Strawinski/Strawinsky or Schoenberg/Schönberg).
- Artists in hip hop and EDM often change spellings or have multiple variants (like Jay Z/Jay-Z or Puffy/P. Diddy/Puff Daddy).
- EDM artists often get credited on their own or as part of collaborations (e.g. Axwell, Sebastian Ingrosso, and Axwell Λ Ingrosso).
When it comes to artist names, accuracy is important in all genres. Should the album be attributed to “Drake” or “Nick Drake”? Should the artist name be “Sammy” or “DJ Sammy”? For classical music, the preferred format is typically “First Last.” “André Previn” is correct, rather than “A. Previn,” “Previn, André,” or “Previn.” In some cases, special characters are necessary (e.g. “Béla Bartók” instead of “Bela Bartok”).
What would happen if we didn’t intervene? Every time an album is submitted with a mistake in the artist name, that album will not show in a search for the correct artist name. Let’s use the example of André Previn. Here are some of the name variations that have made their way into our system:
To fix this issue, we look at which spelling has the most data associated with it in our system and compare it with additional research. Our research confirmed that the proper, accepted spelling is “André Previn.” The other records were automatically created by incorrect metadata. As you can see, using the proper é character is important here, as are spelling, formatting, and spacing (poor “AndréPrevin” is afflicted with a missing space).
If a label were to submit an André Previn album under the name “Andri Previn,” it would prove difficult for a listener to find. The album would not show up under the accepted “André Previn” name. We merge these entries so the associated albums show up under “André Previn.” After we correct and merge this metadata, our database automatically reassigns anything further submitted under the names above to “André Previn.”
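The merge-and-reassign behavior described above can be approximated with a simple alias table. This is a rough sketch under stated assumptions: the `ALIASES` dictionary, `lookup_key`, and `resolve_artist` are hypothetical names, the variants are the ones mentioned in this post, and a real catalog would need artist identifiers and human review rather than string matching alone.

```python
import unicodedata

CANONICAL = "André Previn"
# Variants mentioned in this post, all mapped to the accepted spelling.
KNOWN_VARIANTS = ["A. Previn", "Previn, André", "Previn", "AndréPrevin", "Andri Previn"]


def lookup_key(name):
    """Normalize Unicode form, letter case, and surrounding whitespace for matching."""
    return unicodedata.normalize("NFC", name).strip().casefold()


ALIASES = {lookup_key(variant): CANONICAL for variant in KNOWN_VARIANTS}
ALIASES[lookup_key(CANONICAL)] = CANONICAL


def resolve_artist(submitted_name):
    """Return the accepted name for a known variant, else the name unchanged."""
    return ALIASES.get(lookup_key(submitted_name), submitted_name)


print(resolve_artist("Andri Previn"))
print(resolve_artist("andre\u0301 previn"))  # "e" followed by a combining accent
print(resolve_artist("Nick Drake"))
```

Normalizing to a single Unicode form matters because “é” can arrive either as one character or as “e” plus a combining accent, and the two look identical on screen. Names that are not in the table pass through untouched, so unknown artists are never silently renamed.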
Our Content Operations Team repairs these inconsistencies every day. With more than 25,000 new artists submitted to our catalog each month, you can see how valuable proper submission is to database integrity. Every incorrectly submitted new artist must be repaired by hand. As Benjamin Franklin said, “An ounce of prevention is worth a pound of cure.”
Accurate data is paramount to good systems and great user experience. We know how much work goes into preparing, recording, and distributing an album. We all care greatly that it is represented accurately in our system. Accurate metadata submission means users will find the albums they want to hear, and that means more earnings for rights holders. Data submitted to us without the need for intervention makes this process go much more quickly. In short, proper submission = better, faster returns.