By Bryce Merkl Sasaki, Aspiring Graphista | February 10, 2017

“When I initially chose Neo4j, it was a startup project. I found it extremely useful and easy to use, and immediately realized it was the way we wanted to develop,” says John Swain, Product Manager for Data Science and Data Products at Right Relevance.

Coming from the world of SQL and relational databases, Swain quickly grasped the power afforded by a graph database, especially when analyzing social media influencer data.

In this week’s 5-Minute Interview (conducted at GraphConnect San Francisco), we discuss how Right Relevance has used Neo4j to analyze social media conversations and monitor public sentiment on important issues such as Brexit and the US presidential election.

Talk to us about how you use Neo4j at Right Relevance.

John Swain: We use Neo4j as a graph storage database for analyzing social media and influencer data. On my side of the project we use MongoDB as a document store, along with Hadoop processing and SQL databases. A lot of that data remains in those systems and stays in the document store. What we extract into Neo4j is the graph representation of the relationships we’re interested in analyzing.
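As a rough illustration of that extraction step, the sketch below turns document-style records into weighted relationship edges. It is not Right Relevance's pipeline: the `tweets` records, their `user` and `mentions` fields, the `MENTIONS` relationship name and the `extract_mention_edges` helper are all hypothetical, chosen only to show the idea of pulling a graph out of a document store.

```python
from collections import Counter

# Hypothetical tweet documents, shaped loosely like records in a document store.
tweets = [
    {"user": "alice", "mentions": ["bob", "carol"]},
    {"user": "bob", "mentions": ["carol"]},
    {"user": "alice", "mentions": ["bob"]},
]


def extract_mention_edges(docs):
    """Count (source, target) mention pairs across all documents."""
    edges = Counter()
    for doc in docs:
        for target in doc.get("mentions", []):
            if target != doc["user"]:  # ignore self-mentions
                edges[(doc["user"], target)] += 1
    return edges


edges = extract_mention_edges(tweets)

# Print each edge in a Cypher-like pattern to show the graph shape.
for (source, target), weight in sorted(edges.items()):
    print(f"({source})-[:MENTIONS {{weight: {weight}}}]->({target})")
```

The documents themselves stay where they are; only the pairs and their counts would need to be loaded into the graph.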

What made Neo4j stand out?

Swain: When I initially chose Neo4j it was a startup project, and I downloaded Nicole White’s RNeo4j library and used the community edition. I found it extremely useful and easy to use, and immediately realized it was the way we wanted to develop. We’ve used it ever since.

Can you talk to me about some of your most interesting or surprising results you’ve had while using Neo4j?

Swain: A negative surprise we encountered, which has since been rectified, was that we initially couldn’t run algorithms across our entire graph. This was solved in Neo4j 3.0 with the APOC library, which allowed us to run whole-graph algorithms like PageRank, betweenness centrality and machine learning for community detection. And we’ve developed some of those algorithms and published them through the APOC library.
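For readers unfamiliar with whole-graph algorithms, here is a minimal, generic sketch of PageRank by power iteration. It is a textbook version, not the APOC implementation and not Right Relevance's code; the `retweets` edge list, the `pagerank` function and the damping factor of 0.85 are assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {
            n: (1.0 - damping + damping * dangling) / len(nodes) for n in nodes
        }
        # Each node passes a damped, equal share of its rank to every node it links to.
        for source, targets in out_links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical retweet graph: an edge (a, b) means "a retweeted b".
retweets = [
    ("alice", "carol"),
    ("bob", "carol"),
    ("dave", "carol"),
    ("carol", "alice"),
]
scores = pagerank(retweets)

# List users from most to least influential under this toy model.
for user, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{user}: {score:.3f}")
```

Every iteration touches every node and edge, which is why this kind of algorithm needs access to the whole graph rather than a local traversal.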

Talk to us about some of the projects that you’ve worked on with Neo4j.

Swain: We used Neo4j to analyze social media and Twitter conversations around the US presidential election. We’d done similar projects on voting and political campaigns in the UK, notably the Brexit campaign.

We realized that the US presidential election was going to be much bigger than anything else we’d done in this space. We started working with the Neo4j Developer Relations team, and they gave us a lot of support, including how to go about clustering and scaling Neo4j so that it could handle the capacity we needed.


If you could start over with Neo4j, taking everything you know now, what would you do differently?

Swain: I come from an SQL background, and when you’ve worked with that for as long as I have you tend to see the world in terms of relational databases, including with data modeling. And as soon as you adopt a graph database, you’re liberated from that structure. But it’s still tempting to model complex scenarios that would be difficult to do in SQL, and if I were to do this again I would spend more time on the analysis and less time writing graphs.

Anything else you want to add or say?

Swain: Having dealt with lots of technology companies over the years, I can say that everyone I’ve dealt with at Neo4j, especially technical support, has been fantastic. They were enthusiastic, keen to help and solved our problems, which has been a really pleasurable experience.

