I was very impressed by the friendship map made by Facebook intern Paul Butler, and I realized that I had access to a similar dataset at Science-Metrix (an old employer I left a while ago). Instead of a database of friendship data, I had access to a database of scientific collaborations. Bibliometric firms use this kind of data to get a (very) approximate view of science, but I thought that for a data visualization, it was good enough.
This post is now obsolete; please see the new one.
From this database, I extracted and aggregated scientific collaborations between cities all over the world. For example, if a UCLA researcher published a paper with a colleague at the University of Tokyo, this would create an instance of collaboration between Los Angeles and Tokyo. The result of this process is a very long list of city pairs, like Los Angeles-Tokyo, and the number of instances of scientific collaboration between them. Following that, I used the geonames.org database to convert the cities’ names to geographical coordinates.
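As a rough sketch of the aggregation step (not the original code, which was a mix of SQL and Java), the counting could look like the following. The class name `CityPairCounter`, the `|` separator, and the rule that each distinct pair of cities on a paper counts once are all assumptions made for illustration; the post does not spell out the exact counting rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: count collaborations between unordered city pairs. */
final class CityPairCounter {

    /** Order-independent key, so "Tokyo, Los Angeles" and "Los Angeles, Tokyo" match. */
    static String pairKey(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    /**
     * Each paper is given as the list of its authors' cities.
     * Assumption: every distinct pair of cities on a paper counts as one collaboration.
     */
    static Map<String, Integer> countPairs(List<List<String>> papers) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> cities : papers) {
            // Deduplicate, so two authors from the same city do not inflate the count.
            List<String> distinct = new ArrayList<>(new TreeSet<>(cities));
            for (int i = 0; i < distinct.size(); i++) {
                for (int j = i + 1; j < distinct.size(); j++) {
                    counts.merge(pairKey(distinct.get(i), distinct.get(j)), 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

Feeding it one Los Angeles-Tokyo paper and one Tokyo-Los Angeles-Paris paper would give a count of 2 for the Los Angeles-Tokyo pair and 1 for each pair involving Paris.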
The next steps were then similar to those of the Facebook friendship map. I used a Mercator projection to project the geographical coordinates onto the map and used the Great Circle algorithm to trace the lines of collaboration between cities. The brightness of the lines is a function of the logarithm of the number of collaborations between a pair of cities and the logarithm of the distance between those same two cities.
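For readers who want to try the projection and line-tracing steps, here is a minimal sketch using the standard spherical Mercator formulas and the usual great-circle interpolation. It is not the code used for the map: the class and method names are hypothetical, the map is assumed to span 360° of longitude across `width` pixels, and the brightness function is left out because the post does not give its exact form.

```java
/** Hypothetical sketch of the projection and great-circle steps; not the original code. */
final class GreatCircleSketch {

    /** Pixel x for a longitude in degrees, with -180° at x = 0. */
    static double mercatorX(double lonDeg, double width) {
        return (lonDeg + 180.0) / 360.0 * width;
    }

    /** Pixel y for a latitude in degrees; the equator sits at width / 2 and y grows downward. */
    static double mercatorY(double latDeg, double width) {
        double lat = Math.toRadians(latDeg);
        double merc = Math.log(Math.tan(Math.PI / 4.0 + lat / 2.0));
        return width / 2.0 - width * merc / (2.0 * Math.PI);
    }

    /** Central angle in radians between two points given in degrees (haversine formula). */
    static double centralAngle(double lat1, double lon1, double lat2, double lon2) {
        double p1 = Math.toRadians(lat1);
        double p2 = Math.toRadians(lat2);
        double sinHalfLat = Math.sin((p2 - p1) / 2.0);
        double sinHalfLon = Math.sin(Math.toRadians(lon2 - lon1) / 2.0);
        double a = sinHalfLat * sinHalfLat
                + Math.cos(p1) * Math.cos(p2) * sinHalfLon * sinHalfLon;
        return 2.0 * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * Point at fraction f (0..1) along the great circle from point 1 to point 2.
     * Returns {latitude, longitude} in degrees. Antipodal points are not handled.
     */
    static double[] intermediate(double lat1, double lon1,
                                 double lat2, double lon2, double f) {
        double d = centralAngle(lat1, lon1, lat2, lon2);
        if (d < 1e-12) {
            return new double[] {lat1, lon1}; // the two points coincide
        }
        double p1 = Math.toRadians(lat1), l1 = Math.toRadians(lon1);
        double p2 = Math.toRadians(lat2), l2 = Math.toRadians(lon2);
        double a = Math.sin((1.0 - f) * d) / Math.sin(d);
        double b = Math.sin(f * d) / Math.sin(d);
        double x = a * Math.cos(p1) * Math.cos(l1) + b * Math.cos(p2) * Math.cos(l2);
        double y = a * Math.cos(p1) * Math.sin(l1) + b * Math.cos(p2) * Math.sin(l2);
        double z = a * Math.sin(p1) + b * Math.sin(p2);
        double lat = Math.atan2(z, Math.sqrt(x * x + y * y));
        double lon = Math.atan2(y, x);
        return new double[] {Math.toDegrees(lat), Math.toDegrees(lon)};
    }
}
```

To trace one line, you would sample `intermediate` at many values of `f` between 0 and 1 and pass each sampled point through `mercatorX` and `mercatorY` before drawing.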
A high resolution map is available here: http://collabo.olihb.com/collabolinks.jpg. Please don’t hotlink.
A zoomable very high resolution map can be consulted there: http://collabo.olihb.com/
Awesome stuff !! Amazing job !
Would it be possible to put a map underneath ?
It’s a bit complicated because the projection I used had a little bug in it. Nothing drastic, but I can’t use Photoshop to overlay a map. I’ll see if I have time next week.
Before I even saw these comments, I saved the image as a TIF and was able to georeference it to a map using a Mercator projection. I did notice the projection seemed off; glad you confirmed my hunch above. If you want me to send it to you, I can. Let me know. I wasn’t able to use the super-high-quality zoomable image, though, so once you get in pretty close to cities the resolution is poor. But country-wide, it’s great to see city names.
Hi, Olivier.
I’ve just found this image and was really impressed with it. Such a brilliant example of geovisualization.
Couldn’t resist playing with it, guessing the familiar cities. But finally grounded my arms, georeferenced this map and converted it to Google Earth kmz-file:
http://topoaxis.com/blog/collab.kmz
Would you mind if I use this image as an example in an upcoming article on web mapping? With all the credits, of course.
Wow, that’s really good. No problem, you can use my image.
Hey Olivier,
I wanted to ask the same thing. We are building a project that could use your visual as an example.
Cheers, and thanks for this impressive piece of visual data.
Marko
Wow, this is really amazing, I like “maps” like these! Just zoomed in and was able to name the cities just by size and position compared to other knots.
True. I did the same exercise for Portugal, and one can actually figure out the cities just by the relative positions of the knots. Congratulations to the author.
Where did your raw data come from? Is your groomed version available for download?
Thank you. I’m very sorry, but I can’t release the dataset. It’s from Elsevier, and they have a very restrictive license and tons of scary lawyers. But you could do something similar using an open (but less complete) dataset like this one: http://www.informatik.uni-trier.de/~ley/db/
Can you talk more about your source of data, or perhaps release it? I’d be interested in seeing a significantly higher resolution map that zooms all the way down to the level at which you can see specific universities and trace each of their connections.
Nice job on the maps, they’re terrific!
Very nice!
What did you use to plot this?
A mix of SQL and Java, plus Photoshop to adjust levels.
An interesting map and fantastic visualisations. Would we be able to use it in personal presentations, with acknowledgement of course?
Former colleagues in government would love this, and some in the Research Councils might be able to work with you to develop it further.
As a quid pro quo, have a look at my innovation map: http://bit.ly/f1miKC
Very nice, I’ll contact you as soon as possible.
Is this English language only?
It contains only journals aggregated by Elsevier, so I would guess that the big majority of the journals are in English.
R or processing?
Neither, Java. It might not be the best language for this kind of work, but it’s the one I’m most comfortable with.
EU seems to have more science going on than US. Is this borne out in Nobel prizes, and other prizes?
The distances between universities are smaller in Europe than in America, so collaborations are denser in Europe. That’s why they’re brighter.
The universities might be closer [here in Europe], but they’re often worlds apart in terms of languages and cultures…
Thanks for your post. I’ve used it to make a more detailed view on position of the Czech Republic regarding scientific cooperation in Europe – http://blog.bzzzwa.net/post/2963331149/scientific-collaboration-europe-czech-map
Very nice. The next iteration of this map might include an interactive tool and an overlay.
I did the same, but for Italy.
I used QGIS to georeference your image:
http://de.straba.us/2011/02/08/mappa-della-collaborazione-scientifica-fra-ricercatori-zoom-sullitalia/
Sorry, the post is in Italian.
Beautiful map. I am wondering if you could create something that would give insight into developing-world collaborations. For example, one could focus on Africa, filter out all collaborations that are not with Africa, and see who is achieving what there, in terms of both south-south and north-south collaborations.
Thanks, I’ll keep your suggestions in mind for the next iteration of this map.
It would be interesting to normalise it by the number of people in a country. The Netherlands is very densely populated, and this might be partly contributing to the fact that it appears so bright.
Don’t forget that this only covers English language publications. That might also account for both the Netherlands and the UK. The Dutch are known for their English language skills.
Is it possible that the French, Germans and Italians prefer their own languages in this context?
I don’t think so. Many major publications in Germany are in English, especially about major discoveries. We want to discuss this with the international community after all! On the other hand, reviews or summaries might be in German, but that’s just a small part for all I know (physics).
Wonderful work. It would be interesting to have a way to see the “strength” of the links. Say LA collaborates more with Tokyo than with Paris, so the LA-Tokyo distance is drawn shorter than LA-Paris; this would give a visual idea of how collaborations are based on geography, and how much geography affects research.
We’re presenting a paper on scaleless collaboration indicators at the ISSI in Durban that you might find interesting. I’ll post it here as soon as it’s out of the embargo.
It’s nice but it would be good to be able to zoom in even more. There are many light blobs in big cities where it’s impossible to tell what’s going on.
It’s an interesting picture, but it tells more about Elsevier journals than “scientific collaboration”.
The reason that Holland is so bright is simply that that’s where Elsevier is based. Conversely, Russia is so pale just because the government stimulates publishing in domestic state-owned journals by paying royalties. Countries such as China and Poland are rather pale because people there also tend to publish more in domestic journals; and that is also clearly the reason why the US is paler than Europe. (Distances within California and within the Northwest are just like in Europe, so that’s not the problem.) So unfortunately your map provides very little clue as to how scientific collaboration actually works worldwide.
But it does say something interesting about Elsevier. For instance, consider Florida. Miami area is very bright, and I can locate Orlando, Tampa/St Petersburg area and Tallahassee. But the main research center in Florida is Gainesville, which I can’t see at all on your map! Any university ranking will tell you that University of Florida is far ahead of any other Florida institution; the next one is Florida State Uni, at Tallahassee, which is still quite pale on your map compared to the resorts of South Florida. Leaving aside universities, Jacksonville, which is a considerable hi-tech business center in Florida, is also not on your map. Should I conclude that higher-level researchers in the US (say from top 100 universities and from some leading companies) publish mostly in domestic and Springer journals, and only really low-profile researchers and amateurs go with Elsevier journals?
Finally, I think that countries like New Zealand and Australia are disproportionally pale mostly because of your logarithmic scale of distances. This seems to indicate that not only your data but also your formulas are problematic. I suspect that you didn’t like the linear scale because it seemed to produce messy results; then you thought, OK, let’s do a logarithmic scaling. But if you think about what can contribute to messiness, it should be about distances versus areas. So I suspect that a quadratic (or perhaps cubic) scale could be more appropriate.
Thank you for your comment, but I never intended this map to be an S&T tool. It sparked discussion in the mainstream (and scientific) press. That’s enough for me.
Science-Metrix (my employer) just submitted a paper to the 2011 ISSI in Durban about a new scaleless collaboration indicator that takes care of scaling worries in collaboration measures. I will post it here when I get the go-ahead.
Btw, Elsevier’s Scopus and Thomson’s Web of Science are the main databases used in bibliometric studies. Both are missing journals, but they are adding more missing journals every year. No data sources are perfect.
That’s enough for you? Well indeed your blog is getting a lot of attention this way, but are you selling something that you own? You’re positioning your map as an indicator of scientific collaboration, and that’s the message people are getting; clarifications about Elsevier and logarithmic scale are a fine print which predictably they’re not getting.
For instance, the top post I currently see at http://www.guardian.co.uk/news/datablog/blogosphere
is about your map,
http://flowingdata.com/2011/01/27/map-of-scientific-collaboration-between-researchers/
and people do notice that “the brightest country is actually the Netherlands!” but do not notice that it’s all about a publisher that used to be Dutch.
And leaving emotions aside, what information does your map actually contain? For collaborations within countries and subcontinents, one can hope that your picture is not highly affected by incomplete data and worries about scaling. But these local pictures turn out to highly correlate with population density maps, such as those:
http://www.iiasa.ac.at/Research/ERD/DB/mapdb/map_9.htm
http://commons.wikimedia.org/wiki/File:Florida_population_map.png
Perhaps dividing out by the population density would make your picture more informative?
In comparing different regions of the world, I don’t see how your current map makes any sense as a measure of collaboration. Would Japan become brighter than Northwest Europe if Elsevier owned a half of Japanese journals? (Sorry, in my previous post I of course meant Northeast, not Northwest of the US.) Would Ukraine become brighter than Iran if all journals were included? I have no idea…
I understand your grievances about the mainstream press. You can’t expect them to get all the nuances. They are generalists, not specialists.
I plan to work on other versions of the maps later this month and I will keep your suggestions in mind. For the next iteration, I plan on using scaleless indicators to address the population density issues and the distortion caused by the sheer number of papers published by the US.
But like I said in another thread, I never expected the map to get so popular. It’s more of a work in progress than a scientific tool.
I’m glad someone else pointed this out. The blog post as it is gives the false impression that all scientific collaborations occur in a certain subset of English-speaking parts of the world. When the title and description of the post don’t mention the database used, this “map” is kind of misleading.
@Sherwood
I mention the database used (Scopus) on the map.
The Scopus dataset may not be a problem, actually; publications serve as an incomplete but easily accessible proxy for the amount of research activity, which is rather difficult to measure in reality. I used bibliometric data from Scopus myself, and it revealed the general trend with only a few inconsistencies.
i say, more projects like this should be discussed. cheers!
Olivier’s map depicts “how scientific collaboration actually works worldwide” well, despite methodological considerations of any kind (e.g., scaling, database, country population).
It is a mapamundi, but not a city map …nor a Science or Technology Roadmap. No one will make any decision on Science & Technology Policy or budget on the basis of a mapamundi.
Does anyone have a much more accurate picture?
Wow. I never intended this picture to be an S&T Roadmap and I don’t know where you got this idea.
It is a pretty picture that sparked discussion on a rarely spoken about subject in mainstream media.
And you succeeded famously. I love that a visualization can send several “someones” down into the trenches of “how you should have done it,” meanwhile missing the forest for the trees.
Fantastic job.
Thank you, I appreciate that.
That is!!! I completely agree with you!!!
Olivier’s map is a very well done big picture for seeing the forest, but not the trees. Of course the map could be improved by taking into account issues like country population or Springer’s databases, which those who “went down into the trenches of ‘how you should have done it’” already mentioned. But it doesn’t matter much now.
I aimed to address that:
1) Olivier’s map is good enough for the purpose it was created: to see the forest. Olivier’s map IS NOT a S&T Roadmap for specific decision-making, but IT IS an amazingly powerful tool for the analysis of how the very vast majority of the world’s S&T collaboration is going on.
2) the “someones” who supposedly know how to do Olivier’s work much better, or want to see the trees at Florida in the US or Dar es Salaam in Tanzania: please, do it!
FOR THE AVOIDANCE OF DOUBT: great job Olivier!!!
I’ll be proud to be in contact with you.
Very interesting.
Very nice map, thank you!
I have a little suggestion based on the fact that citation is not symmetric like friendship on Facebook: paint the lines with a gradient, for example blue to red, where blue represents where the citation was made and red where the cited paper was published, or vice versa.
Oops. I missed that the map is built upon collaboration, not citation. Sorry.
Great visualisation! Let me know if I can help on the geography part.
Thanks. Your mapmyconnection webapp looks very interesting. I’ll take a closer look when I get back to work next week.
Just an FYI: I misspelled the domain name of my experiment from last year: http://www.mapmyconnections.com
just leave it out…
Cheers,
jw.
Is it possible for you to make the bilateral data accessible (Country1-Country2-SomeMeasuresOfCollaborationVolume)? This is good bilateral information that would certainly be useful for the academic community.
This is a wonderful work!
Cesar A. Hidalgo (MIT)
Thanks. I would imagine that this kind of data is already available (maybe not in a dataset form) from scientific articles. I’ll see what I can do because it’s not my data.
In the meantime, you might find this type of data in journals like “Scientometrics” or in the ISSI proceedings.
I am interested in seeing how the same dataset would look on a Pacific-centric map (the kind you see in airports in Asia with the EU on the far left, East Coast US on the far right). Can you do this?
Hmmm… Great idea. I don’t have the time right now to code it.
But if you want to do it in a quick-and-dirty way, you could open the image in Photoshop, cut the image down the middle and move the left part to the right.
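The same cut-and-move trick can also be done in a few lines of Java. This is only a sketch under a couple of assumptions: the image spans the full 360° of longitude and is centered on Greenwich, so swapping the two halves yields a Pacific-centered view. The class name `RecenterSketch` is made up for the example.

```java
import java.awt.image.BufferedImage;

/** Hypothetical sketch: swap the left and right halves of a world map image. */
final class RecenterSketch {

    /** Returns a copy in which the original left half ends up on the right. */
    static BufferedImage swapHalves(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        int shift = w / 2; // cut position, in pixels from the left edge
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // Rotate each column horizontally, wrapping around the right edge.
                out.setRGB((x + w - shift) % w, y, src.getRGB(x, y));
            }
        }
        return out;
    }
}
```

The image could be loaded and saved with `javax.imageio.ImageIO.read` and `ImageIO.write`; for a very large map, keep in mind that the whole image has to fit in memory.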
It would be nice to have access to the dataset.
But I do understand the limit imposed by your data provider.
But you can at least examine the Java code you’ve written?
Have you thought about doing a temporal analysis?
Let me explain: extract data divided by year in order to create an animation that shows how certain areas have grown into the world of scientific research.
I am very interested in constructing a map like this for my thesis, to compare the scientific collaborations of my institution with those of comparable institutions. Have you developed a tool that can produce a map like this from a given data set? I think that this would make a great visual tool for me to examine the success of collaboration at my institution.
Thanks!!
Very impressive. Not least because here in the UK the press and television, when it is mentioned at all, give the impression that only from the USA do we get any sort of scientific activity. This is a refreshing overview. One small point. Only the “high-resolution” image is complete. The others crop New Zealand out of the picture.
Is this copyright protected? I’d like to use it in an upcoming video.
It is.
But if you want to use it in a video, just mention the source and author in the credits.
I really like this picture! Is it available for purchase?
Not right now, but it will be for sale later this year on this website: http://scimaps.org/
Outstanding! Thanks for sharing your graph.
This is interesting stuff, but takes no account of the (national/ethnic) identity of researchers.
For example, we are researching trans-national research networks among mainland Chinese researchers, from different parts of the world. Is there a way to factor that into the mapping exercise, as far as anyone knows?
Thank you for your comment.
The full address of each researcher is available (including the nationality) in the raw data.
Unfortunately, I cannot release this raw data as it belongs to Elsevier and the terms of the license are very strict. But I plan to release, maybe this summer, an interactive tool to explore collaborations between countries and other clusters.
I’ll send you an email when it’s available.
…Superb work! Really very impressive! I can’t wait to see how it looks on a poster; I wonder whether the level of detail can be preserved… Bravo again!
I ran a test print on a 1 m by 1 m laminated tarp, with Europe roughly in the center occupying a 50 cm by 50 cm square. It’s very, very pretty.
Fantastic work! Can you throw some more light on Java libs used?
Thank you.
I’ve used the Java Map Projection Library for the subsequent versions of the map to ensure a high level of precision in the projection.
http://www.jhlabs.com/java/maps/proj/
We’re certainly interested in a better understanding of the complex collaboration relationships that take place among scientists, fields, universities, industries, and countries; that’s why we are here. I guess, Olivier, you have started quite an interesting project, regardless of the opinions, most of which I’ve found to be positive and meant to enrich your map.
Let us know if we can help in anyway, keep going!
Some French geographers made a comment on your map:
http://mappemonde.mgm.fr/num30/internet/int11201.html
Thanks, I just read it. You would think that before writing a long article like that, they would read my blog post.
They say that it’s not a good analytical tool, I never said it was… If you do anything serious with that, you’re mad. It’s only a pretty picture for god’s sake… 🙄
For a serious take on scientific collaboration (and more importantly scale effects), they should read our paper presented at ISSI Durban:
Eric Archambault, Olivier H. Beauchesne, Grégoire Côté and Guillaume Roberge; Scale-Adjusted Metrics of Scientific Collaboration
Fantastic visualisation Olivier.
I’m a university student looking to create a visualisation based on global shipping container movements. I gather that Paul Butler’s work was created using ‘R’, can I ask what software packages you used to create this?
Thank you Lorcan.
I used a custom Java program, but it might be simpler in your case to use R. The Flowing Data blog had a really nice tutorial on how to create similar maps a couple of months ago.
Beautiful visualizations, Olivier! I’m also interested in the complex networks that support scientific work and find this project to be a fascinating exploration. I’m looking forward to digging into it myself and seeing where it takes you. I was very excited to meet you at the 4S conference and speak about these a little bit. I wonder how this visualization would look if applied to other networks like science contributions on Wikipedia, or Wikipedia as a whole.
Thanks Stephanie, I just sent you an email.
Hi, these are really nice, do you think that R will produce similar quality results as what you have achieved here with Java?
Very interesting. Have you undertaken any analysis of cross-discipline work? If so, I’d be very interested to see it.
Thank you. This would be very interesting indeed. I’ve quit this job a while ago so I do not have access to the original dataset to do this analysis. Sorry.
This picture is amazing; it could be used to illustrate what we are trying to build: “global collaboration across organizations”.
Can you suggest a tool to build such a map based on collaboration nodes?
About teamtown : http://collaborate.com/node/310
If you want to discuss collaboration opportunities, please let me know.
I placed the image over a map, and am surprised I don’t really see any lines or links associated with Los Alamos National Laboratory (at least less than Santa Fe). Not that many in Albuquerque, either. Northern New Mexico has about 20,000 people (~50% PhDs) working at 2 major national labs — publishing papers. I would expect to see a very bright line between Sandia and Los Alamos National Labs, as they are physically and academically close. This doesn’t really seem to show up at all on your figure.
The map is really more of an art piece than an analytical tool.
If you want to study scientific collaboration in the US, you could use one of my old visualization tools, found here: http://olihb.com/2012/01/21/scientific-collaborations-by-metropolitan-statistical-areas/
You can access it by clicking the link at the bottom of the page. While in the tool, you can select the Los Alamos CBSA and the CBSA with the most scientific collaboration will be Albuquerque.
I no longer work for the company that provided me with the data used in the map, so I cannot update or correct it.