Map of scientific collaboration between researchers

I was very impressed by the friendship map made by Facebook intern Paul Butler, and I realized that I had access to a similar dataset at Science-Metrix (a former employer that I left a while ago). Instead of a database of friendship data, I had access to a database of scientific collaborations. Bibliometric firms use this kind of data to get a (very) approximate view of science, but I thought that for a data visualization, it was good enough.
This post is now obsolete; please see the new one (click here!)
From this database, I extracted and aggregated scientific collaboration between cities all over the world. For example, if a UCLA researcher published a paper with a colleague at the University of Tokyo, this would create an instance of collaboration between Los Angeles and Tokyo. The result of this process is a very long list of city pairs, like Los Angeles-Tokyo, and the number of instances of scientific collaboration between them. Following that, I used the GeoNames database (geonames.org) to convert the cities’ names to geographical coordinates.
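The aggregation step described above can be sketched roughly as follows. This is only an illustration in Python, not the code used for the map: the function name `count_city_pairs`, the sample papers, and the choice to count each pair of distinct cities once per paper are all assumptions.

```python
from collections import Counter
from itertools import combinations


def count_city_pairs(papers):
    """Count collaboration instances between unordered pairs of cities.

    `papers` is an iterable of lists of author cities, one list per paper.
    """
    counts = Counter()
    for cities in papers:
        # Assumption: each pair of distinct cities counts once per paper,
        # and co-authors from the same city do not create a pair.
        for a, b in combinations(sorted(set(cities)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical sample data, for illustration only.
papers = [
    ["Los Angeles", "Tokyo"],
    ["Tokyo", "Los Angeles", "Paris"],
    ["Paris", "Paris"],
]
pair_counts = count_city_pairs(papers)
for (city_a, city_b), n in sorted(pair_counts.items()):
    print(f"{city_a}-{city_b}: {n}")
```

Sorting the city names before pairing means Los Angeles-Tokyo and Tokyo-Los Angeles land in the same bucket, which is what an undirected collaboration count needs.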
The next steps were then similar to those of the Facebook friendship map. I used a Mercator projection to project the geographical coordinates onto the map and used a great-circle algorithm to trace the lines of collaboration between cities. The brightness of the lines is a function of the logarithm of the number of collaborations between a pair of cities and the logarithm of the distance between those same two cities.
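For readers who want to try something similar, here is a minimal Python sketch of those three pieces. It is not the original code. The function names, the map width, and especially the way the two logarithms are combined in `brightness` (including the `distance_weight` parameter and its sign) are assumptions; the post only says that brightness depends on both logarithms.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def mercator(lat_deg, lon_deg, width=1000):
    """Project latitude/longitude in degrees to x, y on a map `width` units wide."""
    x = (lon_deg + 180.0) / 360.0 * width
    lat = math.radians(lat_deg)
    y = width / 2.0 - width / (2.0 * math.pi) * math.log(math.tan(math.pi / 4.0 + lat / 2.0))
    return x, y


def to_unit_vector(lat_deg, lon_deg):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))


def central_angle(start, end):
    """Angle in radians between two (lat, lon) points as seen from Earth's center."""
    a, b = to_unit_vector(*start), to_unit_vector(*end)
    dot = sum(p * q for p, q in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))


def great_circle_points(start, end, steps=50):
    """Return steps + 1 (lat, lon) points along the great-circle arc.

    Uses spherical linear interpolation; exactly antipodal points are not handled.
    """
    a, b = to_unit_vector(*start), to_unit_vector(*end)
    omega = central_angle(start, end)
    if omega < 1e-12:  # the two points coincide
        return [start] * (steps + 1)
    points = []
    for i in range(steps + 1):
        t = i / steps
        wa = math.sin((1.0 - t) * omega) / math.sin(omega)
        wb = math.sin(t * omega) / math.sin(omega)
        x, y, z = (wa * p + wb * q for p, q in zip(a, b))
        lat = math.degrees(math.asin(max(-1.0, min(1.0, z))))
        points.append((lat, math.degrees(math.atan2(y, x))))
    return points


def brightness(count, distance_km, distance_weight=-0.5):
    """Unnormalized line score. ASSUMED form: the post does not give the formula."""
    return math.log1p(count) + distance_weight * math.log1p(distance_km)


# Example: trace Los Angeles-Tokyo (approximate coordinates) in map space.
los_angeles, tokyo = (34.05, -118.24), (35.68, 139.69)
line = [mercator(lat, lon) for lat, lon in great_circle_points(los_angeles, tokyo)]
distance_km = central_angle(los_angeles, tokyo) * EARTH_RADIUS_KM
score = brightness(count=120, distance_km=distance_km)  # count is made up
```

A line that crosses the 180° meridian, like this one, jumps from one edge of the projected map to the other, so a renderer has to split it into two segments before drawing.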
A high resolution map is available here: http://collabo.olihb.com/collabolinks.jpg. Please don’t hotlink.
A zoomable, very high resolution map can be consulted here: http://collabo.olihb.com/

163 thoughts on “Map of scientific collaboration between researchers”

      1. Before I even saw these comments, I saved the image as a TIF and was able to georeference it to a map using a Mercator projection. I did notice the projection seemed off; glad you confirmed my hunch above. If you want me to send it to you, I can. Let me know. I wasn’t able to use the super-high-quality zoomable image, though, so once you get in pretty close to cities the resolution is poor. But country-wide, it’s great to see city names.

      2. Hi, Olivier.
        I’ve just found this image and was really impressed with it. Such a brilliant example of geovisualization.
        Couldn’t resist playing with it, guessing the familiar cities. But finally grounded my arms, georeferenced this map and converted it to Google Earth kmz-file:
        http://topoaxis.com/blog/collab.kmz
        Would you mind if I use this image as an example in an upcoming article on web mapping? With all the credits, of course.

          1. Hey Olivier,
            I wanted to ask the same thing. We are building a project that could use your visual as an example.
            Cheers, and thanks for this impressive piece of visual data.
            Marko

  1. Wow, this is really amazing, I like “maps” like these! Just zoomed in and was able to name the cities just by size and position compared to other knots.

    1. True. I did the same exercise for Portugal, and one can actually figure out the cities just by the relative positions of the knots. Congratulations to the author.

  2. Can you talk more about your source of data, or perhaps release it? I’d be interested in seeing a significantly higher resolution map that zooms all the way down to the level at which you can see specific universities and trace each of their connections.
    Nice job on the maps, they’re terrific!

  3. An interesting map and fantastic visualisations. Would we be able to use it in personal presentations, with acknowledgement of course?
    Former colleagues in government would love this, and some in the Research Councils might be able to work with you to develop it further.
    As a quid pro quo, have a look at my innovation map: http://bit.ly/f1miKC

      1. The universities might be closer [here in Europe], but they’re often worlds apart in terms of languages and cultures…

  4. Beautiful map. I am wondering if you could create something that would give insight into developing-world collaborations, e.g. one could focus on Africa, filter out all collaborations that are not with Africa, and see who is achieving what there, in terms of both south-south and north-south collaborations.

  5. Pingback: My Week in Maps
  6. It would be interesting to normalise it by the number of people in a country. The Netherlands is very densely populated, and this might be partly contributing to the fact that it appears so bright.

    1. Don’t forget that this only covers English language publications. That might also account for both the Netherlands and the UK. The Dutch are known for their English language skills.
      Is it possible that the French, Germans and Italians prefer their own languages in this context?

      1. I don’t think so. Many major publications in Germany are in English, especially about major discoveries. We want to discuss this with the international community after all! On the other hand, reviews or summaries might be in German, but that’s just a small part for all I know (physics).

  7. Wonderful work. It would be interesting to have a way to see the “strength” of the links: say LA collaborates more with Tokyo than with Paris, so the LA-Tokyo distance is drawn shorter than LA-Paris. This would give a visual idea of how collaborations are based on geography, and how much geography affects research.

  8. It’s nice but it would be good to be able to zoom in even more. There are many light blobs in big cities where it’s impossible to tell what’s going on.

  9. It’s an interesting picture, but it tells more about Elsevier journals than “scientific collaboration”.
    The reason that Holland is so bright is simply that that’s where Elsevier is based. Conversely, Russia is so pale just because the government stimulates publishing in domestic state-owned journals by paying royalties. Countries such as China and Poland are rather pale because people there also tend to publish more in domestic journals; and that is also clearly the reason why the US is paler than Europe. (Distances within California and within Northwest are just like in Europe, so that’s not the problem.) So unfortunately your map provides very little clue to how scientific collaboration actually works worldwide.
    But it does say something interesting about Elsevier. For instance, consider Florida. Miami area is very bright, and I can locate Orlando, Tampa/St Petersburg area and Tallahassee. But the main research center in Florida is Gainesville, which I can’t see at all on your map! Any university ranking will tell you that University of Florida is far ahead of any other Florida institution; the next one is Florida State Uni, at Tallahassee, which is still quite pale on your map compared to the resorts of South Florida. Leaving aside universities, Jacksonville, which is a considerable hi-tech business center in Florida, is also not on your map. Should I conclude that higher-level researchers in the US (say from top 100 universities and from some leading companies) publish mostly in domestic and Springer journals, and only really low-profile researchers and amateurs go with Elsevier journals?
    Finally, I think that countries like New Zealand and Australia are disproportionally pale mostly because of your logarithmic scale of distances. This seems to indicate that not only your data but also your formulas are problematic. I suspect that you didn’t like the linear scale because it seemed to produce messy results; then you thought, OK, let’s do a logarithmic scaling. But if you think about what can contribute to messiness, it should be about distances versus areas. So I suspect that a quadratic (or perhaps cubic) scale could be more appropriate.

    1. Thank you for your comment, but I never intended this map to be an S&T tool. It sparked discussion in the mainstream (and scientific) press. That’s enough for me.
      Science-Metrix (my employer) just submitted a paper to the 2011 ISSI in Durban about a new scaleless collaboration indicator that takes care of scaling worries in collaboration measures. I will post it here when I get the go-ahead.
      Btw, Elsevier’s Scopus and Thomson’s Web of Science are the main databases used in bibliometric studies. Both are missing journals, but they are adding more missing journals every year. No data sources are perfect.

      1. That’s enough for you? Well indeed your blog is getting a lot of attention this way, but are you selling something that you own? You’re positioning your map as an indicator of scientific collaboration, and that’s the message people are getting; clarifications about Elsevier and logarithmic scale are a fine print which predictably they’re not getting.
        For instance, the top post I currently see at http://www.guardian.co.uk/news/datablog/blogosphere
        is about your map,
        http://flowingdata.com/2011/01/27/map-of-scientific-collaboration-between-researchers/
        and people do notice that “the brightest country is actually the Netherlands!” but do not notice that it’s all about a publisher that used to be Dutch.
        And leaving emotions aside, what information does your map actually contain? For collaborations within countries and subcontinents, one can hope that your picture is not highly affected by incomplete data and worries about scaling. But these local pictures turn out to highly correlate with population density maps, such as those:
        http://www.iiasa.ac.at/Research/ERD/DB/mapdb/map_9.htm
        http://commons.wikimedia.org/wiki/File:Florida_population_map.png
        Perhaps dividing out by the population density would make your picture more informative?
        In comparing different regions of the world, I don’t see how your current map makes any sense as a measure of collaboration. Would Japan become brighter than Northwest Europe if Elsevier owned a half of Japanese journals? (Sorry, in my previous post I of course meant Northeast, not Northwest of the US.) Would Ukraine become brighter than Iran if all journals were included? I have no idea…

        1. I understand your grievances about the mainstream press. You can’t expect them to get all the nuances. They are generalists, not specialists.
          I plan to work on other versions of the maps later this month and I will keep your suggestions in mind. For the next iteration, I plan on using scaleless indicators to address the population density issues and the distortion caused by the sheer number of papers published by the US.
          But like I said in another thread, I never expected the map to get so popular. It’s more of a work in progress than a scientific tool.

          1. I’m glad someone else pointed this out. The blog post as it is gives the false impression that all scientific collaborations occur in a certain subset of English-speaking parts of the world. When the title and description of the post don’t mention the database used, this “map” is kind of misleading.

          2. The Scopus dataset may not be a problem, actually; publications serve as incomplete but easily accessible proxies for the amount of research activity, which is rather difficult to measure in reality. I used bibliometric data from Scopus myself, which revealed the general trend, and only a few inconsistencies.
            I say more projects like this should be discussed. Cheers!

  10. Olivier’s map depicts “how scientific collaboration actually works worldwide” well, despite methodological considerations of any kinds (e.g., scaling, data base, country population).
    It is a mapamundi, but not a city map …nor a Science or Technology Roadmap. No one will make any decision on Science & Technology Policy or budget on the basis of a mapamundi.
    Does anyone have a much more accurate picture?

      1. And you succeeded famously. I love that a visualization can send several “someones” down into the trenches of “how you should have done it,” meanwhile missing the forest for the trees.
        Fantastic job.

          1. That is!!! I completely agree with you!!!
            Olivier’s map is a very well done big picture to see the forest, but not the trees. Of course the map could be improved by taking into account issues like country population or Springer’s databases that those who “went down into the trenches of ‘how you should have done it’” already mentioned. But it doesn’t matter much now.
            I aimed to address that:
            1) Olivier’s map is good enough for the purpose it was created: to see the forest. Olivier’s map IS NOT a S&T Roadmap for specific decision-making, but IT IS an amazingly powerful tool for the analysis of how the very vast majority of the world’s S&T collaboration is going on.
            2) the “someones” who supposedly know how to do Olivier’s work much better, or want to see the trees at Florida in the US or Dar es Salaam in Tanzania: please, do it!
            FOR THE AVOIDANCE OF DOUBT: great job Olivier!!!
            I’ll be proud to be in contact with you.

  11. Very nice map, thank you!
    I have a little suggestion based on the fact that citation is not symmetric like friendship on Facebook: paint the lines with a gradient, for example blue to red, where blue represents where the citation was made and red where the cited paper was published, or vice versa.

  12. Is it possible for you to make the bilateral data accessible (Country1-Country2-SomeMeasuresOfCollaborationVolume)? This is good bilateral information that would certainly be useful for the academic community.
    This is a wonderful work!
    Cesar A. Hidalgo (MIT)

    1. Thanks. I would imagine that this kind of data is already available (maybe not in a dataset form) from scientific articles. I’ll see what I can do because it’s not my data.
      In the meantime, you might find this type of data in journals like “Scientometrics” or the ISSI proceedings.

  13. I am interested in seeing how the same dataset would look on a Pacific-centric map (the kind you see in airports in Asia with the EU on the far left, East Coast US on the far right). Can you do this?

    1. Hmmm… Great idea. I don’t have the time right now to code it.
      But if you want to do it in a quick&dirty way, you could open the image in Photoshop, cut the image in the middle and move the left part of it on the right.

  14. It would be nice to have access to the dataset.
    But I do understand the limit imposed by your data provider.
    But you can at least examine the Java code you’ve written?
    Have you thought about doing a temporal analysis?
    Let me explain: extract data divided by year in order to create an animation that shows how certain areas have grown into the world of scientific research.

  15. I am very interested in constructing a map like this for my thesis to compare the scientific collaborations of my institution with those of comparable institutions. Have you developed a tool that can produce a map such as this from a given data set? I think that this would make a great visual tool for me to examine the success of collaboration at my institution.
    Thanks!!

  16. Very impressive. Not least because here in the UK the press and television, when it is mentioned at all, give the impression that only from the USA do we get any sort of scientific activity. This is a refreshing overview. One small point. Only the “high-resolution” image is complete. The others crop New Zealand out of the picture.

  17. This is interesting stuff, but takes no account of the (national/ethnic) identity of researchers.
    For example, we are researching trans-national research networks among mainland Chinese researchers, from different parts of the world. Is there a way to factor that into the mapping exercise, as far as anyone knows?

    1. Thank you for your comment.
      The full address of each researcher is available (including the nationality) in the raw data.
      Unfortunately, I cannot release this raw data as it belongs to Elsevier and the terms of the license are very strict. But I plan to release, maybe this summer, an interactive tool to explore collaborations between countries and other clusters.
      I’ll send you an email when it’s available.

  18. …Superb work! Really very impressive! I can’t wait to see how it looks on a poster; I wonder whether the level of detail can be preserved… Bravo again!

    1. I ran a test print on a 1 m by 1 m laminated tarp, with Europe roughly in the center occupying a 50 cm by 50 cm square. It is very, very pretty.

  19. We’re certainly interested in a better understanding of the complex collaboration relationships that take place among scientists, fields, universities, industries, and countries; that’s why we are here. I guess, Olivier, you have started quite an interesting project, regardless of the opinions, most of which I’ve found to be positive and meant to enrich your map.
    Let us know if we can help in anyway, keep going!

    1. Thanks, I just read it. You would think that before writing a long article like that, they would read my blog post.
      They say that it’s not a good analytical tool, I never said it was… If you do anything serious with that, you’re mad. It’s only a pretty picture for god’s sake… 🙄
      For a serious take on scientific collaboration (and more importantly scale effects), they should read our paper presented at ISSI Durban:
      Eric Archambault, Olivier H. Beauchesne, Grégoire Côté and
      Guillaume Roberge; Scale-Adjusted Metrics of Scientific Collaboration

  20. Fantastic visualisation Olivier.
    I’m a university student looking to create a visualisation based on global shipping container movements. I gather that Paul Butler’s work was created using ‘R’, can I ask what software packages you used to create this?

  21. Beautiful visualizations, Olivier! I’m also interested in the complex networks that support scientific work and find this project to be a fascinating exploration. I’m looking forward to digging into it myself and seeing where it takes you. I was very excited to meet you at the 4S conference and speak about these a little bit. I wonder how this visualization would look if applied to other networks like science contributions on Wikipedia, or Wikipedia as a whole.

  22. Very interesting, have you undertaken any analysis of cross-discipline work? If so I’d be very interested to see it

  23. This picture is amazing; it could be used to illustrate what we are trying to build: “global collaboration across organizations”.
    Can you suggest a tool to build such a map based on collaboration nodes?
    About teamtown : http://collaborate.com/node/310
    If you want to discuss collaboration opportunities, please let me know.

  24. I placed the image over a map, and am surprised I don’t really see any lines or links associated with Los Alamos National Laboratory (at least less than Santa Fe). Not that many in Albuquerque, either. Northern New Mexico has about 20,000 people (~50% PhDs) working at 2 major national labs — publishing papers. I would expect to see a very bright line between Sandia and Los Alamos National Labs, as they are physically and academically close. This doesn’t really seem to show up at all on your figure.

    1. The map is really more of an art piece rather than an analytical tool.
      If you want to study scientific collaboration in the US, you could use one of my old visualization tools, found here: http://olihb.com/2012/01/21/scientific-collaborations-by-metropolitan-statistical-areas/
      You can access it by clicking the link at the bottom of the page. While in the tool, you can select the Los Alamos CBSA, and the CBSA with the most scientific collaboration will be Albuquerque.
      I no longer work for the company that provided me with the data used in the map, so I cannot update or correct it.
