According to Wikipedia, open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control. Although the term "open data" is new, the philosophy behind it, which I think is sharism, transparency, and reducing information asymmetry, has long been established. For example, government reports and consumer reports have always included some data and charts. In the past, however, the public often could not get access to the complete data set. What they saw was partial data with a single, fixed interpretation, with no alternatives offered and no way to work with the data directly. This kind of pre-selected disclosure can introduce ambiguity and bias. With advances in technology, civic awareness, and public engagement, far more data is now available online, covering a variety of industries and policy areas. A simple question is: is it good enough? Let's take education, especially higher education, as an example.
1. Examples of current practice in the U.S.
- The MyData Initiative, which is part of Ed.gov, aims to give every student (or the parent of an underage student) access to his or her own academic data in a machine-readable format.
- The College Affordability and Transparency Center, initiated by the U.S. Department of Education, mainly presents data on the financial side of pursuing education.
- NCES, the National Center for Education Statistics, is the primary federal entity for collecting and analyzing data related to education. It has a section called College Navigator, which has data on nearly 7,000 colleges and universities in the United States. In particular, it offers various search criteria to help users find "the right college" for themselves.
The basic search covers location, institution type, etc.
The advanced search covers much more detailed information such as Test Scores – 25th Percentile, % of Applicants Admitted, Housing, etc.
2. Examples of current practice in China
In China, where I come from, open data and open government are gaining more public attention. On February 1st, Minister of Education Yuan Guiren said at the National Education and Scientific Research Conference that we should make more data accessible to education and scientific research institutions, to enable a better environment for education and scientific research.
- Research Database of Higher Education in China is an online database initiated by the Higher Education Development Center of Xiamen University. It contains survey data on undergraduate students from 11 provinces over 5 years.
- Chinese Higher Education Information Searching System is a platform initiated by one of the biggest and most powerful Internet companies in China: Tencent. It has search toolbars similar to those described above. High school graduates can enter some personal information (such as their estimated score in the National University Entrance Examination) and choose some filters to get a recommended school and/or major.
- Chinese Higher Education Student Information System is an active online archive of university students' information dating back to 2001. One use of the website is for students to verify their degrees and important national certificates for potential employers.
3. An imaginary search engine in the future
Above we looked at current practice in two countries. Although the U.S. has more advanced practice and more transparency in general, both still have a long way to go. A good goal might be a personal tracking system that records one's educational performance from the very beginning, provides timely feedback, and generates periodic reports.
However, in this blog post I do not intend to write much about how open data could help improve education quality and assessment methodology. Instead, I am trying to envision the future from another perspective: looking at education, especially higher education, as an investment. If people are investing 4+ years of their time and plenty of money in higher education, how could open data help them make a more informed decision?
In his presentation on how to use open government data to empower consumers, Joel Gurin gives as one of the three principles of smart disclosure design that the data should be as usable as possible. However, different people have different standards for what counts as "usable". Data that a scientific researcher or a graduate student considers usable might still be hard to make sense of for a high school student, or for parents, who have no statistics background yet want to choose a university program wisely.
In my opinion, one solution is to leverage the power of crowdsourcing. Either the government or a large educational institution could encourage people who have statistical skills and a passion for improving education as a whole to help interpret the huge volume of data sets published online. Charts, PowerPoint slides, infographics, data visualizations... any tool that makes the data easier to understand for the people making choices would be warmly welcomed. Of course, with an extra step of interpretation between raw data and consumption, some personal perspective or bias will inevitably creep in. However, this problem would be much less significant than it was in the past.
In the past, the situation was:
Raw data (not available) --> data owner's interpretation, which might be biased or even misleading --> the public
Now the situation is:
Raw data (open online to everyone) --> data volunteers' alternative interpretations, which might also be biased yet can be cross-checked --> the public
In particular, crowdsourcing would also greatly reduce the risk that an organization which stands to profit from misleading data manipulates how that data is interpreted. Education, although a personal choice, can have great positive externalities for society. It is therefore better to keep it from falling entirely under the control of money and market power.
Imagine that one day in the future, a high school student is deciding which university to apply to. He or she goes to a dedicated search engine where all sources of higher-education data are well interpreted and integrated. After entering some basic information, he or she can find out which schools are financially affordable; which majors best fit his or her interests, learning style, or cognitive traits; what kinds of careers lie ahead of a given school or major; and, if he or she is interested in life in a specific city, which school has the strongest alumni network in that city... All of this is presented in such a user-friendly way that no one has to choose blindly because they failed to find or read the data. How wonderful would that be!
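To make the idea a little more concrete, here is a minimal sketch in Python of what the core of such a search might look like. Everything in it is an assumption made for illustration: the `School` record, its fields, the `shortlist` function, and the three sample schools are all made up, and a real system would draw on actual open data sets and far richer criteria.

```python
from dataclasses import dataclass, field


@dataclass
class School:
    # Hypothetical record; every field here is an assumption for illustration.
    name: str
    annual_cost: int  # total yearly cost in dollars
    majors: set = field(default_factory=set)
    alumni_by_city: dict = field(default_factory=dict)  # city -> alumni count


def shortlist(schools, budget, major, city):
    """Keep schools within budget that offer the major,
    ordered by alumni count in the given city (largest first)."""
    matches = [
        s for s in schools
        if s.annual_cost <= budget and major in s.majors
    ]
    return sorted(
        matches,
        key=lambda s: s.alumni_by_city.get(city, 0),
        reverse=True,
    )


# Made-up sample data, not real institutions or figures.
SCHOOLS = [
    School("Alpha College", 18000, {"economics", "biology"}, {"Chicago": 1200}),
    School("Beta University", 42000, {"economics"}, {"Chicago": 5000}),
    School("Gamma Institute", 15000, {"economics"}, {"Chicago": 3100, "Boston": 400}),
]

for school in shortlist(SCHOOLS, budget=20000, major="economics", city="Chicago"):
    print(school.name)
```

With this sample data, the student with a 20,000 dollar budget sees Gamma Institute first and Alpha College second, while Beta University is dropped as unaffordable. The hard part, of course, is not the filtering but getting trustworthy, well-interpreted data into records like these.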