Interview: How Up Hail uses Scrapy to Increase Transparency
During the 2016 Collision Conference held in New Orleans, Scrapinghub Content Strategist Cecilia Haynes had the opportunity to interview the brains and the brawn behind Up Hail, the rideshare comparison app.
Avi Wilensky is the Founder of Up Hail
Avi sat down with Cecilia and shared how he and his team use Scrapy and web scraping to help users find the best rideshare and taxi deals in real time.
Fun fact: Up Hail was named one of Mashable’s 11 Most Useful Web Tools of 2014.
Meet Team Up Hail
CH: Thanks for meeting with me! Can you share a bit about your background, what your company is, and what you do?
AW: We are team Up Hail and we are a search engine for ground transportation like taxis and ride-hailing services. We are now starting to add public transportation like trains and buses, as well as bike shares. We crawl the web using Scrapy and other tools to gather data about who is giving the best rates for certain destinations.
Scrapy for the win
There’s a lot of data out there, especially public transportation data on different government or public websites. This data is unstructured, messy, and not available through APIs. Scrapy’s been very useful in gathering it.
CH: How has your rate of growth been so far?
AW: Approximately 100,000 new users a month search our site and app, which is nice and hopefully we will continue to grow. There’s a lot more competition now than when we started, and we’re working really hard to be the leader in this space.
Users come to our site to compare rates and to find the best deals on taxis and ground transportation. They are also interested in finding out if the different service providers are available in their cities. There are many places in the United States and across the world that don’t have these services, so we attract those who want to find out more information.
We also crawl and gather a lot of different product attributes such as economy vs. luxury, shared vs. private, how many people each of these options fit, whether they accept cash, and whether you can book in advance.
Giving users transparency on different car services and transportation options is our mission.
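The product attributes Avi lists map naturally onto a small record type. The following is a rough sketch of how such crawled attributes might be modeled and compared, not a description of Up Hail's data model: the `RideOption` class, its field names, the `cheapest` helper, and the sample providers and fares are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RideOption:
    """One crawled service offering; every field name here is illustrative."""

    provider: str
    fare_usd: float
    tier: str              # "economy" or "luxury"
    shared: bool           # True for pooled rides, False for private ones
    seats: int             # how many passengers the option fits
    accepts_cash: bool
    advance_booking: bool


def cheapest(options, *, min_seats=1, cash_only=False):
    """Return the lowest-fare option that passes the filters, or None."""
    matches = [
        option
        for option in options
        if option.seats >= min_seats and (option.accepts_cash or not cash_only)
    ]
    return min(matches, key=lambda option: option.fare_usd, default=None)


# Made-up sample data, not real fares.
options = [
    RideOption("Provider A", 18.50, "economy", True, 2, False, False),
    RideOption("Provider B", 24.00, "economy", False, 4, True, True),
    RideOption("Provider C", 61.00, "luxury", False, 6, True, True),
]

best = cheapest(options, min_seats=4, cash_only=True)
print(best.provider, best.fare_usd)
```

Keeping each attribute as an explicit field means a comparison site can filter on any of them (group size, cash payment, advance booking) before ranking by price.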
CH: By the way, where are you based?
AW: We’re based in midtown Manhattan in a place called A Space Apart. This is run by a very notable web designer and author named Jeffrey Zeldman, who has been gracious enough to host us. He also runs A Book Apart, An Event Apart, and A List Apart, which are some of the most popular communities for web developers and designers.
Why the Team Members at Up Hail are Scrapy Fans
CH: You have really found some creative applications for Scrapy. I have to ask, why Scrapy? What do you appreciate about it?
AW: A lot of the sites that we’re crawling are a mess, especially the government transit ones and local taxi companies. As a framework, Scrapy has a lot of features built in right out of the box that are useful for us.
CH: Is there anything in particular that you’re like, “I’m obsessed with this aspect of Scrapy?”
AW: We’re a Python shop and Scrapy is the Python library for building web crawlers. That’s primarily why we use it. Of course, Scrapy has such a vibrant ecosystem of developers and it’s just easy to use. The documentation is great and it was super simple to get started. It just does the job. We’re grateful that you make such a wonderful tool [Note: We are the original authors and lead maintainers of Scrapy] that is free and open source to startups like us. There are a lot of companies in your space that are charging a lot of money and making it cost-prohibitive to use.
CH: That’s really great to hear! We’re all about open source, so keeping Scrapy forever free is a really important aspect of this approach.
On Being a Python Shop
CH: So tell me a bit more about why you’re a Python shop?
AW: Our application runs on the Python Flask framework and we’re using Python libraries to do a lot of the back-end work.
CH: Dare I ask why you’re using Python?
AW: One of the early developers on the project is a Xoogler, and Python is one of Google’s primary languages. He really inspired us to use Python and we just love the language because of its philosophy of readability, brevity, and being simple yet powerful enough to get the job done.
I think developer time is scarce and Python makes it faster to deploy, especially for a startup that needs to ship fast.
Introducing Scrapy Cloud and the Scrapinghub Platform
CH: May I ask whether you’ve used our Scrapy Cloud Platform to deploy Scrapy crawlers?
AW: We haven’t tried it out yet. We just found out about Scrapy Cloud, actually.
CH: Really? Where did you hear about us?
AW: I listen to a Python podcast [Talk Python To Me] that had an episode with Pablo, one of your co-founders. I didn’t know that Scrapy originated with your co-founders. When I saw your name in the Collision Conference app, I was like, “Oh, I know these guys from the podcast! They’re maintainers of Scrapy.” Now that we know about Scrapy Cloud, we’ll give it a try.
We usually run Scrapy locally or we’ll deploy Scrapy on an EC2 instance on Amazon Web Services.
CH: Yeah, Scrapy Cloud is our forever free production environment that lets you build, deploy, and scale your Scrapy spiders. We’ve actually just included support for Docker. Definitely let me know what you think of Scrapy Cloud when you use it.
AW: Definitely, I’ll have to check it out.
Plans for Up Hail’s Expansion
CH: Where are you hoping to grow within the next five years?
AW: That’s a very good question. We’re hoping to, of course, expand to more regions. Right now, we’re in the United States, Canada, and Europe. There are a lot of other countries with tremendous populations that we’re not covering. We’d like to add a lot more transportation options into the mix. There are all these new things like on-demand helicopters, and we want to show users going from point A to point B all their available options. We’re kind of like the Expedia of ground transportation.
Also, we’re adding a lot of interesting new things like a scoring system. We’re scoring how rideshare-friendly a city is. New York and San Francisco, of course, get 10s, but maybe over in New Jersey, where there are fewer options, some cities will get 6 or 7. It depends on how many options are