Using Python to Serve Big Data on the Web at Cybo, the Global Business Directory
by Armin C. Stross-Radschinski, last modified Apr 13, 2014 12:16 PM
Contributors: Nicholas Wilson, Cybo
Creative Commons 3.0 CC-BY-SA
Python serves Cybo both in delivering data on the web and in the activities required to manage big data.
Introduction
Cybo came into being in 2011 with the aspiration of becoming the largest and most versatile business directory in the world. As of mid-2012, the directory spans 180 countries in over 30 languages, and it will soon cover even more countries in even more languages.
Cybo competes with much larger companies, doing so with free, open-source software and a significantly smaller staff than that of its competitors. Python has been an enormously important tool in Cybo's success thus far.
Clean and Concise
Before working at Cybo, the developers had backgrounds in other common programming languages such as C++ and Java. The transition to Python has been smooth thanks to its elegant syntax.
Big Data
Cybo maintains thousands of databases of business information, on which many programmatic checks are run to ensure the quality of our data. PostgreSQL (http://www.postgresql.org/) and Python have both been enormously important tools in managing and cleaning this data.
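As an illustration of what a programmatic quality check can look like, here is a minimal sketch. It is not Cybo's actual code: the `businesses` table, its columns, and the sample rows are hypothetical, and the standard-library `sqlite3` module stands in for PostgreSQL so that the example runs without a database server.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with a hypothetical businesses table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE businesses "
        "(id INTEGER PRIMARY KEY, name TEXT, city TEXT, phone TEXT)"
    )
    conn.executemany(
        "INSERT INTO businesses (id, name, city, phone) VALUES (?, ?, ?, ?)",
        [
            (1, "Acme Bakery", "Portland", "555-0100"),
            (2, "Acme Bakery", "Portland", ""),   # duplicate listing, empty phone
            (3, "Blue Cafe", "Lyon", None),       # missing phone
            (4, "Blue Cafe", "Paris", "555-0199"),
        ],
    )
    return conn


def find_quality_issues(conn):
    """Return (ids with no phone number, duplicated (name, city, count) rows)."""
    missing_phone = conn.execute(
        "SELECT id FROM businesses "
        "WHERE phone IS NULL OR TRIM(phone) = '' ORDER BY id"
    ).fetchall()
    duplicates = conn.execute(
        "SELECT name, city, COUNT(*) FROM businesses "
        "GROUP BY name, city HAVING COUNT(*) > 1 ORDER BY name, city"
    ).fetchall()
    return [row[0] for row in missing_phone], duplicates


if __name__ == "__main__":
    missing, duplicates = find_quality_issues(build_sample_db())
    print("Missing phone:", missing)
    print("Duplicates:", duplicates)
```

Both checks are plain SQL, so the same queries should carry over to a PostgreSQL connection with little change.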
Extensive Standard Library
The modules available to Python are extensive, allowing the language to be used in a wide range of circumstances. From validating millions of websites to making extensive use of regular expressions to sort through strings in the data, Python's built-in modules serve Cybo's many needs.
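The sketch below shows how the standard library alone can handle a first pass over website strings. It is an assumption-laden illustration rather than Cybo's method: the `normalize_website` helper and the hostname pattern are hypothetical, and the check is purely syntactic (it never contacts the site).

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: dot-separated labels of letters, digits, and hyphens,
# ending in an alphabetic top-level domain.
HOSTNAME_RE = re.compile(
    r"^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$",
    re.IGNORECASE,
)


def normalize_website(raw):
    """Return a cleaned http(s) URL, or None if raw does not look like a website."""
    candidate = raw.strip()
    if not candidate:
        return None
    # Listings often omit the scheme, so assume http when none is given.
    if "://" not in candidate:
        candidate = "http://" + candidate
    try:
        parts = urlsplit(candidate)
        host = parts.hostname or ""
    except ValueError:
        return None
    if parts.scheme not in ("http", "https"):
        return None
    # Reject user-info forms such as mailto-style strings that slipped through.
    if "@" in parts.netloc:
        return None
    if not HOSTNAME_RE.match(host):
        return None
    return f"{parts.scheme}://{host}{parts.path or '/'}"


if __name__ == "__main__":
    for value in ["  WWW.Example.COM ", "https://cybo.com/about", "not a website"]:
        print(repr(value), "->", normalize_website(value))
```

A syntactic filter like this is cheap enough to run over millions of rows before any slower network check is attempted.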
Great Community Support
Python is actively used and supported by a large internet community. Its extremely active IRC channel (#python on irc.freenode.net) can be quite helpful. In addition to this official channel, many of the individual modules available for Python have separate channels for more specific support.
Python as a Web Platform
Cybo uses Django, an MVC (model-view-controller) web framework written in Python. The platform is fast and connects directly to our data servers, allowing Cybo to serve hundreds of thousands of pages a day.
An International Language
Because the Python community is so large and the language is open source, international concerns are always on the table. As Cybo expands its global reach, internationalization of the website is absolutely crucial. Python's numerous modules allow Cybo to easily serve the site in a multitude of languages.
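To give a sense of how message translation works in Python, here is a minimal sketch built on the standard-library `gettext` module. The `DictTranslations` class, the `CATALOGS` mapping, and the sample strings are hypothetical; a real site would normally load compiled `.mo` catalogs (or rely on its web framework's translation layer) instead of in-memory dictionaries.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Tiny in-memory catalog used only for illustration."""

    def __init__(self, catalog):
        super().__init__()
        self._messages = dict(catalog)

    def gettext(self, message):
        # Fall back to the original string when no translation exists.
        return self._messages.get(message, message)


# Hypothetical catalogs keyed by language code.
CATALOGS = {
    "de": DictTranslations({"Search businesses": "Unternehmen suchen"}),
    "fr": DictTranslations({"Search businesses": "Rechercher des entreprises"}),
}


def translate(message, language):
    """Translate message into language, or return it unchanged if unsupported."""
    translations = CATALOGS.get(language, gettext.NullTranslations())
    return translations.gettext(message)


if __name__ == "__main__":
    for code in ("de", "fr", "ja"):
        print(code, "->", translate("Search businesses", code))
```

Falling back to the source string keeps pages usable in languages whose catalogs are still incomplete.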
We use Python almost exclusively, from delivering web content to the extensive processing of our data. The resources available online and the community support for Python are exceptional, and all of us here at Cybo have come to appreciate the language. Without it, we could not have built what we have as easily, and it has opened the door to much more that we plan to do with our global directory.