Boris leaves me excellent links from time to time in my del.icio.us account! I usually find them in those in-between times, idling between jobs, when I remember to go over, see what’zup and find lovely info gifts in the Links For You section. This time he left a delightful info present: Chris Jordan’s Running the Numbers photo exhibit, an exquisite artistic way of making numbers tangible.

This new series looks at contemporary American culture through the austere lens of statistics. Each image portrays a specific quantity of something: fifteen million sheets of office paper (five minutes of paper use); 106,000 aluminum cans (thirty seconds of can consumption) and so on. My hope is that images representing these quantities might have a different effect than the raw numbers alone, such as we find daily in articles and books. Statistics can feel abstract and anesthetizing, making it difficult to connect with and make meaning of 3.6 million SUV sales in one year, for example, or 2.3 million Americans in prison, or 426,000 cell phones retired every day. This project visually examines these vast and bizarre measures of our society, in large intricately detailed prints assembled from thousands of smaller photographs.

I luv how he plays with scale and patterns to represent the tyranny of our mass consumption (see Plastic Bottles, 2007) or his choice of materials (see Building Blocks, 2007) to symbolize an issue.

Chris Jordan Shipping Containers 2007

Here are some of the photographic themes his photos depict:

  • nine million wooden ABC blocks, equal to the number of American children with no health insurance coverage in 2007.
  • 8 million toothpicks, equal to the number of trees harvested in the US every month to make the paper for mail order catalogs.
  • two million plastic beverage bottles, the number used in the US every five minutes.
  • 65,000 cigarettes, equal to the number of American teenagers under age eighteen who become addicted to cigarettes every month.

Material and consumption culture is frighteningly beautiful in his photos. My favorite is the

  • 75,000 shipping containers, the number of containers processed through American ports every day (Photos in this post).

Chris Jordan Shipping Containers 2007

That’s a lot of stuff moving from place to place!

What does it cost taxpayers when public institutions purchase public data? As citizens we do not like to pay for the same thing many times. So here is a real scenario, with best-guess #s, of what taxpayers pay when public institutions, whose job it is to work in the public interest, re-purchase data that citizens have already paid for once through taxation:

a) Each Canadian municipality, city or town purchases demographic data from Statistics Canada. Let’s suggest there are approximately 2000 of these entities. Let’s say they each purchase a subset of the Census at varying scales, with a specialized geography to match their boundaries, and conservatively spend $ 10 000 each (factoring in that some small towns will buy less and others more).

2000 Towns/municipalities/cities * $ 10 000 = $ 20 000 000

b) Since many cities/towns/municipalities do not have efficient data infrastructures to manage their data assets, sometimes different departments purchase the same data two or three times. So you may get planning, health and social welfare departments each purchasing the same data and not sharing, as they are unaware of each other’s purchases and there is no central accessible repository they can mutually search. So let’s pretend that the top 100 (a conservative #) cities in Canada purchase the same/similar data 3 times each. We already counted one purchase above, but we will keep it at 3, as potentially some have purchased 4 times while the other 1900 units may have done so at least once.

100 Towns/municipalities/cities * 3 (duplicate copies of the same data) * $ 10 000 = $3 000 000

c) The best part: often each of these Towns/municipalities/cities is purchasing data for its entire province, as it wishes to do some cross comparisons. This means that each of these entities is paying for the exact same/similar data set each time! Damn! Talk about a non-rivalrous good, and how smart is StatCan? Damn, we thought the public service did not have a corporate mindset!

d) The Provinces and Territories also each purchase Census data. They do not necessarily have a centralized data infrastructure either, and they have bigger bureaucracies, more departments, more specialized needs and bigger data requirements. So let’s suggest that each Province and Territory spends $ 15 000 * 5 duplicate/similar sets, and an additional $ 10 000 each on multiple special orders between censuses.

13 Provinces/Territories * $ 15 000 * 5 = $ 975 000

13 Provinces/Territories * $ 10 000 = $ 130 000

e) Again, many of the Provinces and Territories will purchase National scale datasets for comparison purposes. Like the Towns/municipalities/cities, they are purchasing the exact same/similar copy of the exact same/similar data sets for the exact same geography numerous times. Recall the great part about information is its non-rivalrousness! We can each consume the same entity many times and none will suffer as a result. Unless of course you are a Canadian taxpayer.

f) Then we have the Federal Government with approximately 350 departments and agencies, and let’s say each purchases some city data, some provincial data and a whole bunch of national data for $ 17 000 each. Then many, let’s say 175, of these departments and agencies are purchasing special-order data sets to meet their particular needs, each at $ 7 500.

350 Federal Departments and Agencies * $ 17 000 = $ 5 950 000

175 Federal Departments and Agencies * $ 7 500 = $ 1 312 500

TOTAL:

  1. 2000 Towns/municipalities/cities * $ 10 000 = $ 20 000 000
  2. 100 Towns/municipalities/cities * 3 (duplicate copies of the same data) * $ 10 000 = $3 000 000
  3. 13 Provinces/Territories * $ 15 000 * 5 = $ 975 000
  4. 13 Provinces/Territories * $ 10 000 = $ 130 000
  5. 350 Federal Departments and Agencies * $ 17 000 = $ 5 950 000
  6. 175 Federal Departments and Agencies * $ 7 500 = $ 1 312 500

Grand Total of Census Data Expenditures by Taxpayers via Public Institutions in Canada: $ 31 367 500
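For anyone who wants to swap in their own guesses, here is a minimal sketch of the same arithmetic in Python. The figures are the best-guess #s from this post, not real costs, and the names (LINE_ITEMS, subtotal, grand_total, format_dollars) are just made up for illustration.

```python
# Line items from the estimate in this post:
# (label, number of buyers, copies each, dollars per copy).
# All figures are the post's guesses, not measured costs.
LINE_ITEMS = [
    ("Towns/municipalities/cities, base purchase", 2000, 1, 10_000),
    ("Top 100 cities, duplicate copies", 100, 3, 10_000),
    ("Provinces/Territories, duplicate/similar sets", 13, 5, 15_000),
    ("Provinces/Territories, special orders", 13, 1, 10_000),
    ("Federal departments and agencies, base purchase", 350, 1, 17_000),
    ("Federal departments and agencies, special orders", 175, 1, 7_500),
]


def subtotal(buyers, copies, unit_cost):
    """Cost of one line item: buyers * copies * dollars per copy."""
    return buyers * copies * unit_cost


def grand_total(items):
    """Sum of all line-item subtotals."""
    return sum(subtotal(b, c, u) for _, b, c, u in items)


def format_dollars(amount):
    """Format an integer dollar amount with spaces as thousands separators."""
    return "$ " + f"{amount:,}".replace(",", " ")


for label, buyers, copies, unit_cost in LINE_ITEMS:
    print(f"{label}: {format_dollars(subtotal(buyers, copies, unit_cost))}")
print("Grand total:", format_dollars(grand_total(LINE_ITEMS)))
```

Change any number in LINE_ITEMS (say, the number of municipalities or the price per purchase) and the grand total follows, which makes it easy to test how sensitive the estimate is to each guess.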

The above is a conservative number, as it does not include human resource expenditures like the following:

  1. Person hours associated for each public servant to negotiate and discuss their data needs
  2. Person hours for the StatCan officials to fill in the orders
  3. Person hours of the public servant lawyers to take care of licensing
  4. Person hours associated with all of the purchasing and accounting work to pay for, acquire and account for this money
  5. Person hours for each official who has to work the data in the same way to meet their needs
  6. Dunno if public agencies pay taxes on these! That would add insult to injury would it not?

It is also important to note that hospitals, school boards, universities, crown corporations and a host of other quasi-public institutions are doing the same thing. Note too that these numbers are only for census data; they do not include the cost of other datasets like road networks, water quality, maps, environment data and so on.

Would seem to me that we could spend a fraction of that cost to deliver the data online to all of these institutions, the private sector, NGOs and Citizens, and we would all be better off financially. We would avoid all the administration costs and the license management costs, and we would all be smarter too! Further, we could reinvest that money into more research, air quality infrastructure, healthcare, waiving recreation fees in municipalities, etc. We could reinvest wisely in quality of life and know more about how to do so at the same time.

PS-If anyone has:

  • come across any type of cost analysis reports etc.
  • has a better way to calculate this
  • knows of some real costs

Please pass them along! The more we have on this the better.

Says Jesse Robbins:

The Istanbul Declaration (see: pdf) signed at the [OECD World Forum] calls for governments to make their statistical data freely available online as a “public good.” The declaration also calls for new measures of happiness and well-being, going beyond just economic output and GDP. This requires the creation of new tools, which the OECD envisions will be “wiki for progress.” Expect to hear more about these initiatives soon.

From the Declaration (pdf):

A culture of evidence-based decision making has to be promoted at all levels, to increase the welfare of societies. And in the “information age,” welfare depends in part on transparent and accountable public policy making. The availability of statistical indicators of economic, social, and environmental outcomes and their dissemination to citizens can contribute to promoting good governance and the improvement of democratic processes. It can strengthen citizens’ capacity to influence the goals of the societies they live in through debate and consensus building, and increase the accountability of public policies.

Hear hear!

Looks like some of us are using fewer pesticides, purchasing a few more energy-efficient and water conservation devices, and composting only very slightly more than before. It seems we dunno what to do with our toxic waste: we still throw out medicines and electronics in the regular curb pick up. And we still commute to work one person per car, which is too bad since

Passenger transportation accounts for about 12 per cent of Canada’s greenhouse gas emissions and efforts to improve efficiency are a high-profile part of the global warming debate.

Also, sadly we drink way more bottled water than is necessary in a country with an excellent drinking water infrastructure.

It would be great to get a hold of the raw data and play with it. It could be mapped and studied with other variables like income, city versus rural, ethnicity, mother tongue, population density, etc. This type of analysis could help target campaigns in certain under-performing areas and study why others are doing better.

Sources:

Putting Canadian “Piracy” in Perspective, a video from Geist and Albahary is a great way to present an argument. In Geist’s words

over the past year, Canadians have faced a barrage of claims painting Canada as a “piracy haven.” This video – the second in my collaboration with Daniel Albahary – moves beyond the headlines to demonstrate how the claims do not tell the whole story.

The video also uses quite a bit of public and private sector data to support its argument. This to me is what public data are for and this is what democracy looks like – when civil society has access to the data it requires to keep its government accountable, can keep citizens informed and can temper industry desires with public interest!

One of the cultural issues that has become pervasive of late is the proliferation of policies and decisions based on assumptions and not on facts. In the case of the very powerful lobby against Canada on IP in the cultural sector, that means really biased reports that are based not on facts but on an industry’s desires and self-interests. Look for the sources of the data and the methodology in all reports. Even in this great video! Geist and Albahary do a great job of contrasting what is being said and repeated (memes) about the cultural industry in Canada with reality.

It is interesting that the video ends with a slide acknowledging the photos used, the music heard, the creators of the video and the license but not all the data sources in the charts! Some of the data references are in some of the bar charts while most statements are referenced with their source at the bottom of the slide. I always look for data references, else how can I go back and verify what was purported!

The data in the charts were:

  • Hollywood Studio Revenue Growth – Data Source unknown
  • Top Hollywood International Markets – Data Source unknown
  • Canadian Music Releases – Statistics Canada
  • Canadian Artist Share of Sales – Canadian Heritage Music Industry Profile
  • Digital Music Download Sales Growth – Data Source unknown
  • Private Copying Revenues 2000-2005 – Data Source unknown
  • RCMP Crime Data – Data Source unknown but assume the RCMP

*************************************
NOTE: See the comments of this post, the references to the data, quotes and reports that were not listed in the credits or with the information in the film are now fully described on Michael Geist’s Blog here.

Datalibre.ca received an excellent comment on the DLI post about access to some of the Statistics Canada data in schools and public libraries. Today I am looking at E-STAT online and am quite impressed, but alas I have not yet gone to a public library to check out what is actually there and what I can do. Nor do I know the limitations of CANSIM data. I did however speak on the phone with a fine librarian at the Main Ottawa Public Library this morning and look forward to digging for data later on today or tomorrow.

E-STAT is:

Statistics Canada’s interactive learning tool designed with the needs and interests of the education community in mind. E-STAT offers an enormous warehouse of reliable and timely statistics about Canada and its ever-changing people.

Using approximately 2,600 tables from CANSIM*, track trends in virtually every aspect of the lives of Canadians. Updated once a year during the summer, CANSIM contains more than 36 million time series.

Hundreds of schools across the country and Depository Service Program Libraries make these data accessible if you go in person to access them. You can get access to these data online only if you are registered with one of these institutions.

The E-STAT license on the data is quite restrictive.

The Government of Canada (Statistics Canada) is the owner or authorized licensee of all intellectual property rights (including copyright) in the data product referred to as E-STAT. Statistics Canada grants the educational institution a non-exclusive, non-assignable and non-transferable licence to use the data product subject to the terms below.

The data product supplied under this agreement shall at all times remain under the control of the institution. It may not be sold, rented, leased, lent, sub-licensed or transferred to any other institution or organization, and may not be traded or exchanged for any other product or service. The data product may not be used for the personal or commercial gain of any authorized user, nor to develop or derive for sale any other data product that incorporates or uses any part of this data product.

The data made available are the yearly updated Canadian Socio-economic Information Management System (CANSIM) data; the daily updates are sold for commercial purposes. I am also not sure how fine the geography is for E-STAT data, for instance whether the data are available by Dissemination Block (DB), Dissemination Area (DA), Census Tract (CT) or Urban Area (UA) (note the cost associated with these and other maps). These make a difference, since DB is the finest granularity, DA is a larger neighbourhood level, CT covers a larger area, and UAs are larger still. Each scale suits a different level of analysis, and if you aggregate any of these the boundaries do not necessarily line up. Additionally, DB and DA are only for the 2006 Census while CT and UA are for others. I am guessing E-STAT is CT Scale data and larger.

E-STAT also has some census data, agricultural data, aboriginal survey data, some environmental data and health behaviour data for school-aged children. Clearly not all the data are available, and certainly not the specialized surveys such as business, waste management, household spending, health, surveys of particular sectors, etc. The data come with explanations, and with teacher and user guides.

Let’s see what we can get once I make a visit!

Another great American project, Fedspending.org is:

a free, searchable database of federal government spending…. With over $14 trillion in federal spending, this more open and accessible tool for citizens to find out where federal money goes and who gets it is long overdue. We believe this website is a good first step toward providing that access.

The project is run by OMB Watch, “a nonprofit government watchdog organization located in Washington, DC. Our mission is to promote open government, accountability and citizen participation.” It is funded by the very busy Sunlight Foundation.

United Nations Common Database (UNCDB) … “provides selected series from numerous specialized international data sources for all available countries and areas.”

Even better:

As of 1 May 2007, use of the Common Database will be FREE OF CHARGE. No subscription will be necessary after that date, and any user can enjoy the full range of data, metadata and various search tools without restriction.

Does anyone know of any exciting applications of these datasets?

Jon Udell has been writing about public data a fair bit of late (and he’s agreed to do an interview with us, coming sometime soon). In his latest post, he puts into practice an interesting theory, that good data presented in the right way is a kind of performance art. He demonstrates with a recent hobby horse of his, crime data from his hometown of Keene, which he runs through in a screencast with narration.

Jon’s inspiration for this style of presenting data is Hans Rosling, whose past two TED Talks made data sexy for many who never thought they would put sexy and data in the same universe.

What Rosling and Udell are illustrating is the sort of thing that governments don’t seem to have the time or interest to do: presenting data in a way that average people can grasp. With data presented that way, our communities will necessarily become much better at making sensible decisions, for instance about how and where to spend money. There is no reason why governments can’t be doing this too … but more importantly, there is no reason why taxpayers should not get access to this kind of data. With the data, citizens can find new and innovative ways of displaying and using it (meaning the government doesn’t have to), which, if one has faith in data, people and democracy, should translate to better decision-making in the community.

I tripped over this yesterday while looking for some arguments for and against cost recovery. The arguments are quite good and comprehensive. If any of you can think of more, send them to the civicaccess.ca list or leave comments here.

This text, I believe, was put together by Jo Walsh and colleagues as they were preparing positions for the INSPIRE Directive that became official May 7, 2007. Public Geo Data put together a great campaign, an online petition, a discussion list and superb material to lobby EUROGI for Free and Open Access to Geo Data. At the time the UK was pushing heavily for the Ordnance Survey’s extreme cost recovery model for the EU, while other European nations were working towards more open and free access models. You can read more about it by going through the archive of their mailing list.

Here is the full text for Why Should Government Spatial Data be Free?
