Jan 25 2012

Does the Web Need a .data gTLD?

Debate continues over custom gTLDs as Stephen Wolfram ponders the utility of .data on the web.

ICANN's decision to allow custom generic top-level domains (gTLDs) has caused ripples in the tech community and beyond. The FTC recently wrote a strongly worded letter condemning the decision, but ICANN pressed forward with its plan anyway.

Stephen Wolfram, a world-renowned British computer scientist, recently pondered the usefulness of a .data gTLD in a blog post. As the creator of Wolfram Alpha, the answer engine that Apple’s Siri taps into for its intelligent responses, he sees potential in a .data gTLD aimed at automated systems rather than at humans or search engines. Wolfram writes:

There are product catalogs, store information, event calendars, regulatory filings, inventory data, historical reference material, contact information — lots of things that can be very usefully computed from. But even if these things are somewhere on an organization’s website, there’s no standard way to find them, let alone standard structured formats for them.

My concept for the .data domain is to use it to create the “data web” — in a sense a parallel construct to the ordinary web, but oriented toward structured data intended for computational use. The notion is that alongside a website like wolfram.com, there’d be wolfram.data.

If a human went to wolfram.data, there’d be a structured summary of what data the organization behind it wanted to expose. And if a computational system went there, it’d find just what it needs to ingest the data, and begin computing with it.

But Paul Miller of CloudAve takes an opposing view. He sees a .data gTLD as only adding to the confusion, and as he points out, many .gov, .edu and .com sites are using data subdomains to achieve essentially what Wolfram is arguing for. Miller writes:

Data without context is far less valuable than data with context. Much of that context may be inferred from the domain in which the data lives, with data delivered from a .gov or .edu (or .gov.uk or .ac.uk) site perhaps interpreted differently to data hosted on .com, .biz, or .xxx.

Southampton University, the Open University, and the US Federal Government are able to gather data up and make it available for download via their existing data sites if they choose. This offers human visitors to their sites a degree of convenience, whilst retaining the power and brand attributes of their existing domain.

Gov.data, gov.uk.data, open.ac.uk.data, southampton.ac.uk.data, though? All are messy, in ways that Wolfram’s own wolfram.data would admittedly not be, and all are simply additional registrations that the institutions would have to pay for in order to stop someone else from grabbing the domain.
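To make the naming difference concrete, here is a minimal Python sketch of the three hostname patterns under discussion. The function names are hypothetical and invented for illustration; neither Wolfram nor Miller proposes any code, and no .data gTLD exists to resolve these names against. The sketch only manipulates strings and makes no network calls.

```python
def _labels(host: str) -> list[str]:
    """Split a hostname into lowercase DNS labels, ignoring a trailing dot."""
    return [part for part in host.strip().lower().rstrip(".").split(".") if part]


def gtld_swap(host: str) -> str:
    """Wolfram's example: replace the final label with 'data'.

    wolfram.com becomes wolfram.data
    """
    labels = _labels(host)
    if len(labels) < 2:
        raise ValueError("need at least a name and a top-level domain")
    return ".".join(labels[:-1] + ["data"])


def gtld_append(host: str) -> str:
    """Miller's 'messy' case: append 'data' to the whole existing name.

    open.ac.uk becomes open.ac.uk.data
    """
    return ".".join(_labels(host) + ["data"])


def data_subdomain(host: str) -> str:
    """The existing practice: a 'data' subdomain under the current domain.

    southampton.ac.uk becomes data.southampton.ac.uk
    """
    return ".".join(["data"] + _labels(host))


if __name__ == "__main__":
    for name in ["wolfram.com", "gov", "open.ac.uk", "southampton.ac.uk"]:
        print(name, "|", gtld_append(name), "|", data_subdomain(name))
```

The subdomain form keeps the institution's existing domain, and the context that comes with it, at the end of the name. Both gTLD forms move that context into the middle of the name or drop it entirely, which is the crux of Miller's objection.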

Standardizing a place for data on the web is a powerful concept. A retailer could set up a .data version of its site and let application developers tap into its inventory and catalogs, further fueling innovation in the online shopping experience.

What do you think? Is the effort best left at the subdomain level, or does big data warrant its own gTLD?