Developer?
10x10 is here for you.
If you're an artist or developer interested in information
visualization, 10x10 can be a great data resource for you
and your work. As an information artist myself, I understand
the difficulty of finding interesting and timely data sources
on the web. 10x10 hopes to help with this problem. Every hour,
10x10 gathers the 100 most important words and pictures in
the world, based on what's happening in the news. You are
welcome to use the information produced by 10x10 in your own
non-commercial projects.
Displaying these pictures in a 10x10 grid, as 10x10 does,
is just one application of the data. There are countless other
ways to use this data -- analyzing how words and pictures
come in and out of the news over time, studying world trends
based on what's happening in the news, creating picture-based
worldviews -- your imagination is the only limit.
10x10 has been designed to make it easy for developers to
use the data it produces. This page explains the basic information
architecture of 10x10, and how you can go about using its
data.
Information Architecture.
The data 10x10 produces is structured in a series of folders,
all online and publicly accessible. The folders are named
in accordance with their year/month/day/hour, in the following
manner:
- Standard location of a 10x10 data folder for a single
hour:
http://tenbyten.org/Data/global/YYYY/MM/DD/HH/
- For example, the word and picture data for November 5,
2004 9am would be stored at:
http://tenbyten.org/Data/global/2004/11/05/09/
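The folder naming scheme above can be sketched in a few lines of Python. This is only an illustration: the function name hour_folder_url is made up for this example, and the input time is assumed to already be expressed in the time zone 10x10 uses.

```python
from datetime import datetime

# Base path taken from the folder locations given on this page.
BASE_URL = "http://tenbyten.org/Data/global/"


def hour_folder_url(when: datetime) -> str:
    """Return the data folder URL for the hour containing `when`.

    `when` is assumed to be in the same time zone that 10x10 uses
    for its folder names (hypothetical helper, not part of 10x10).
    """
    # YYYY/MM/DD/HH/ with zero-padded fields and a trailing slash.
    return BASE_URL + when.strftime("%Y/%m/%d/%H/")


# November 5, 2004 9am, the example used above.
print(hour_folder_url(datetime(2004, 11, 5, 9)))
```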
Within the folder for each hour, you will find 200 images
(each word has a full size image and a thumbnail image) and
a wordlist file titled "words.txt". The images are
all JPEGs, and are all named in the following manner:
- "iraq.jpg" - Full size (227x149
pixels) image for the word "iraq".
- "iraq2.jpg" - Thumbnail size
(60x40 pixels) image for the word "iraq".
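As a rough sketch of the image naming convention, the following Python helper derives both image addresses for a word from a data folder address. The name image_urls is hypothetical, and the helper assumes the word is exactly as it appears in the wordlist file.

```python
from typing import Tuple


def image_urls(folder_url: str, word: str) -> Tuple[str, str]:
    """Return (full_size_url, thumbnail_url) for `word` in a data folder.

    Hypothetical helper: it only applies the naming rule described
    above ("word.jpg" for full size, "word2.jpg" for the thumbnail).
    """
    # Data folder addresses must end with a slash.
    if not folder_url.endswith("/"):
        folder_url += "/"
    return folder_url + word + ".jpg", folder_url + word + "2.jpg"


full, thumb = image_urls("http://tenbyten.org/Data/global/2004/11/05/09/", "iraq")
print(full)   # full size image, 227x149 pixels
print(thumb)  # thumbnail image, 60x40 pixels
```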
To find out the top 100 words for the hour, in ranked order,
consult the "words.txt" file in
the hour's folder. "words.txt"
files have 100 lines, with one word on each line. The #1 (most
important) word is on line 1, and the #100 word is on line
100. The lines end with the newline character ("\n"),
with no spaces or punctuation. Here is a sample
"words.txt" file.
You can easily parse these "words.txt"
files, line by line, using any standard scripting language,
such as Perl or PHP.
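Any language will do. As one possible sketch, here is the parsing step in Python. The function names are invented for this example, the sample text is illustrative rather than real 10x10 output, and the UTF-8 decoding in fetch_words is an assumption, since the file encoding is not stated on this page.

```python
import urllib.request
from typing import List


def parse_words(text: str) -> List[str]:
    """Return the words in ranked order; index 0 holds the #1 word."""
    # One word per line, lines end with "\n"; skip any empty trailing line.
    return [line.strip() for line in text.split("\n") if line.strip()]


def rank_of(words: List[str], word: str) -> int:
    """Return the 1-based rank of `word` (raises ValueError if absent)."""
    return words.index(word) + 1


def fetch_words(folder_url: str) -> List[str]:
    """Download and parse "words.txt" from a data folder (not run here).

    Assumption: the file decodes as UTF-8; undecodable bytes are replaced.
    """
    with urllib.request.urlopen(folder_url + "words.txt", timeout=10) as resp:
        return parse_words(resp.read().decode("utf-8", errors="replace"))


# Illustrative three-line sample; a real file has 100 lines.
sample = "iraq\nbush\nelection\n"
words = parse_words(sample)
print(words[0], rank_of(words, "election"))
```

To read a real hour, pass a data folder address (ending with a slash) to fetch_words.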
Images of Day, Month, and Year
In addition to gathering data for each hour, 10x10 determines
the top 100 words and pictures for every day, month, and year,
based on word and picture popularity in that timeframe. The
data for days/months/years follows the same naming conventions
as outlined above for hour data. For example, the data folder
for the top 100 words/pictures for November 5, 2004 would
be located at: http://tenbyten.org/Data/global/2004/11/05/
The data for the top 100 words/pictures of November, 2004
would be located at: http://tenbyten.org/Data/global/2004/11/
And the data for the top 100 words/pictures of 2004 would
be located at: http://tenbyten.org/Data/global/2004/
The day/month/year folders use the same "words.txt"
ranking system as outlined above.
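Because hour, day, month, and year folders share one naming scheme, a single helper can build all four kinds of address. The sketch below is one way to do that in Python; folder_url is a hypothetical name, and the check on the arguments is a design choice of this example, not something 10x10 requires.

```python
from typing import Optional

# Base path taken from the folder locations given on this page.
BASE_URL = "http://tenbyten.org/Data/global/"


def folder_url(year: int, month: Optional[int] = None,
               day: Optional[int] = None, hour: Optional[int] = None) -> str:
    """Return the data folder URL for a year, month, day, or hour."""
    fields = [month, day, hour]
    given = [f is not None for f in fields]
    # A finer field (e.g. day) only makes sense if every coarser one is set.
    if given != sorted(given, reverse=True):
        raise ValueError("month, day, hour must be given from coarse to fine")
    parts = [f"{year:04d}"] + [f"{f:02d}" for f in fields if f is not None]
    return BASE_URL + "/".join(parts) + "/"


print(folder_url(2004, 11, 5))  # top words/pictures for November 5, 2004
print(folder_url(2004, 11))     # top words/pictures for November, 2004
print(folder_url(2004))         # top words/pictures for 2004
```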
Obviously, top data for days/months/years is not available
until after the given day/month/year has finished.
- Top DAY data is generated at 12AM of the next day
- Top MONTH data is generated at 12AM of the first day of
the next month
- Top YEAR data is generated at 12AM of the first day of
the first month of the next year
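The generation schedule above can be written out as a small Python sketch, which may help when deciding whether a day, month, or year folder should exist yet. The function names are hypothetical, and the returned times are naive datetimes meant to be read in 10x10's own time zone.

```python
from datetime import date, datetime, timedelta


def day_data_ready(d: date) -> datetime:
    """Top DAY data: 12AM of the next day."""
    nxt = d + timedelta(days=1)
    return datetime(nxt.year, nxt.month, nxt.day)


def month_data_ready(year: int, month: int) -> datetime:
    """Top MONTH data: 12AM of the first day of the next month."""
    if month == 12:
        return datetime(year + 1, 1, 1)
    return datetime(year, month + 1, 1)


def year_data_ready(year: int) -> datetime:
    """Top YEAR data: 12AM of the first day of the next year."""
    return datetime(year + 1, 1, 1)


print(day_data_ready(date(2004, 11, 5)))
print(month_data_ready(2004, 11))
print(year_data_ready(2004))
```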
Data for Current Hour
To simplify the process of getting the data for the current
hour, 10x10 keeps some relevant current information in the
directory: http://tenbyten.org/Data/global/Now/.
In this folder, you will find the following files:
- "words.txt"
-- the current top 100 words, as explained above.
- "now.jpg"
-- a single JPEG of the 10x10 grid for the current hour.
- "date.jpg"
-- the current hour printed as an image.
- "dateString.txt"
-- a one-line text file containing the directory
path of the current hour (e.g. "2004/11/05/09")
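One way to use "dateString.txt" is to read it first and then build the address of the matching hour folder from its contents. The Python sketch below assumes the file holds exactly the format shown in the example ("2004/11/05/09"); the function names are invented for illustration, and the download function is defined but not run here.

```python
import urllib.request
from datetime import datetime

# Addresses taken from this page.
BASE_URL = "http://tenbyten.org/Data/global/"
NOW_URL = "http://tenbyten.org/Data/global/Now/"


def parse_date_string(text: str) -> datetime:
    """Turn e.g. "2004/11/05/09" into a datetime for that hour."""
    return datetime.strptime(text.strip(), "%Y/%m/%d/%H")


def current_folder_url(date_string: str) -> str:
    """Return the hour folder URL named by the dateString.txt contents."""
    # Drop surrounding whitespace and slashes, then add the trailing slash.
    return BASE_URL + date_string.strip().strip("/") + "/"


def fetch_date_string() -> str:
    """Download "dateString.txt" from the Now folder (not run here)."""
    with urllib.request.urlopen(NOW_URL + "dateString.txt", timeout=10) as resp:
        return resp.read().decode("ascii").strip()


print(parse_date_string("2004/11/05/09\n"))
print(current_folder_url("2004/11/05/09\n"))
```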
TECHNICAL NOTES:
1) Since 10x10 typically takes
around 5-10 minutes to run, data for a given hour is
generally not available until 5-10 minutes past the
hour.
2) When calling one of 10x10's data directories, you
MUST append the trailing slash, or the Apache server
might not recognize the request. For example, to access
the data for November, 2004, you must use: "http://tenbyten.org/Data/global/2004/11/",
as opposed to "http://tenbyten.org/Data/global/2004/11".
3) All data times and dates are based on Eastern Standard
Time (EST).
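The three notes above can be combined into a rough Python sketch that guesses the most recent hour whose data should already be available. Two assumptions are built in and should be checked against real behavior: EST is treated as a fixed UTC-5 offset with no daylight saving adjustment, and a 10 minute margin (the upper end of the typical run time) is allowed after each hour. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset, with no daylight saving adjustment.
EST = timezone(timedelta(hours=-5), "EST")
# Assumption: allow the upper end of the typical 5-10 minute run time.
RUN_MARGIN = timedelta(minutes=10)


def latest_available_hour(now_utc: datetime) -> datetime:
    """Return the most recent EST hour whose data is probably ready.

    `now_utc` must be timezone-aware.
    """
    settled = now_utc.astimezone(EST) - RUN_MARGIN
    return settled.replace(minute=0, second=0, microsecond=0)


def with_trailing_slash(url: str) -> str:
    """Append the required trailing slash if it is missing."""
    return url if url.endswith("/") else url + "/"


# 14:07 UTC is 09:07 EST, so the 9am data may not be ready yet.
early = datetime(2004, 11, 5, 14, 7, tzinfo=timezone.utc)
print(latest_available_hour(early).strftime("%Y/%m/%d/%H"))
print(with_trailing_slash("http://tenbyten.org/Data/global/2004/11"))
```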

Attribution.
You are welcome to use the data produced
by 10x10, but if you do, please include a link back to 10x10
(http://tenbyten.org)
on your site. Thanks!
Please note that 10x10 does not hold the
rights to any of the images that appear on this site. The
images come from several leading international
news sources, and those sources retain all rights to their
images. The photographs are used by 10x10 strictly for non-commercial
purposes.
Contact.
You can contact Jonathan Harris by mailing: jjh
"AT" number27 "DOT" org