UCS Solutions Official Blog

Connecting the dots with Neo4j

Introduction:

Relationships are the most important feature of any database. In a relational database, foreign keys connect the different tables, but it is very difficult to visualize all of the tabular relations without missing something. This drawback can be mitigated by using a graph database. Graphs were first introduced to the world by Leonhard Euler, and graph theory has since been applied in almost every industry.

Neo4j is the leading schema-less graph database. Its best feature is the visualization of data. When you visualize things, you learn more about them. Similarly, when we visualize nodes and the relationships between them, we gain insights into the data and a better understanding of its subtle differences. This is why graph databases are so important. The following figure compares a graph database, like Neo4j, with a traditional relational database (RDBMS).

Figure 1: Comparison between GraphDB and RDBMS

As Figure 1 shows, a graph database performs much better than a traditional RDBMS as the number of joins in a query grows. Neo4j does not require key constraints such as the primary key in an RDBMS, but constraints can be added where needed. In essence, Neo4j’s search is intelligent: you can ask much more precise and useful questions and get back the most relevant and meaningful information, whereas traditional keyword-based search delivers results that are more random, diluted and lower-quality. The flexibility of data modeling in Neo4j also deserves a special mention. You never know what people are going to want to know, so you have to be able to repackage the information in different ways on the fly, and Neo4j makes that possible.

Figure 2: Data Representation comparison between Neo4j and RDBMS

Every database has a specific query language associated with it. To query a Neo4j database, we use the Cypher Query Language (CQL), which is a declarative query language: a query describes the pattern of nodes and relationships to match rather than the steps needed to find it.

Figure 3: Cypher Example

Leverage the Power of Neo4j to Gain Insights from Transactional Data:

A good commercial use of Neo4j is assessing shopping activity and transactions in shopping centers and malls. An increase in the number of visits is critical to long-term viability and business success for both stores and malls, and Neo4j can act as a catalyst by accurately tracking what is happening in them.

Transactional data holds a lot of information. It can tell us a great deal about customers, shoppers and visitors, and that information can be used to build several marketing strategies. The following figure is a simple database model of transactional data in Neo4j:

Figure 4: Data Model for Transaction Data

Understanding the Model:

  • The colored ellipses are nodes and the directed arrows between them are relationships. For example, a customer can carry out a transaction, and that transaction is related to the store and the mall where it took place.
  • The keys shown in braces {} are properties, whose values could be anything. Neo4j stores properties as key-value pairs, written in a JSON-like syntax.
  • There are some categorical data, like Gender, Nationality and Country, whose values are fixed. Such data can be represented as nodes rather than properties. ‘OTHER CATEGORICAL DATA’ indicates that more categorical data can be mapped as required. This is Neo4j’s flexibility.
  • A ‘Mall’ can also be mapped to an area, a city or a country. ‘Brand’ is mapped to a ‘Brand Category’. These additional relationships can give more information.
  • Therefore, enriching the profile of any node, either with properties or with relationships, can reveal valuable and previously unseen information.
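
To make the points above concrete, here is a minimal in-memory sketch of the model in plain Python. It is not Neo4j code. The relationship names (MADE, AT_STORE, AT_MALL, HAS_GENDER), the keys and the property values are all illustrative assumptions, since the names in Figure 4 are not reproduced in the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    label: str  # e.g. "Customer", "Transaction", "Store"
    key: str    # illustrative identifier, unique within a label


@dataclass
class Graph:
    properties: dict = field(default_factory=dict)     # Node -> key-value map
    relationships: list = field(default_factory=list)  # (start, type, end)

    def add_node(self, label, key, **properties):
        node = Node(label, key)
        self.properties[node] = properties
        return node

    def relate(self, start, rel_type, end):
        self.relationships.append((start, rel_type, end))

    def neighbors(self, start, rel_type):
        # Follow outgoing relationships of one type from a node.
        return [end for s, r, end in self.relationships
                if s == start and r == rel_type]


g = Graph()
customer = g.add_node("Customer", "c1", name="Customer One")
female = g.add_node("Gender", "Female")  # categorical value as a node
txn = g.add_node("Transaction", "t1", amount=120.0, date="2018-03-01")
store = g.add_node("Store", "s1", name="Store One")
mall = g.add_node("Mall", "m1", name="Mall One")

# Hypothetical relationship types, chosen only for this sketch.
g.relate(customer, "HAS_GENDER", female)
g.relate(customer, "MADE", txn)
g.relate(txn, "AT_STORE", store)
g.relate(txn, "AT_MALL", mall)
```

Because Gender is a node here, every customer of the same gender points at the same node, which is what makes categorical values convenient starting points for grouping.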

Applications:

A wide range of questions can be asked of the above model. Here I will briefly highlight the ones I found most interesting.

Brand and Mall Affinity:
A customer’s affinity towards a brand, store or mall can be determined with this model. The number of transactions a customer made at a particular brand or mall gives us this information. Transaction dates are very important here, because they let us restrict the count to a relevant period.

Figure 5: Brand Affinity

Figure 6: Mall Affinity
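
The counting behind brand and mall affinity can be sketched in a few lines of plain Python. This is an illustration of the idea only, not how Neo4j computes it. The flattened tuple layout, the customer ids and the brand and mall names are made-up assumptions.

```python
from collections import Counter
from datetime import date

# Hypothetical flattened view: (customer_id, brand, mall, transaction_date)
transactions = [
    ("c1", "BrandA", "Mall1", date(2018, 1, 5)),
    ("c1", "BrandA", "Mall1", date(2018, 2, 9)),
    ("c1", "BrandB", "Mall2", date(2018, 2, 11)),
    ("c2", "BrandA", "Mall2", date(2017, 6, 1)),
    ("c2", "BrandB", "Mall2", date(2018, 3, 3)),
]


def affinity(rows, target, start, end):
    """Count transactions per (customer, brand or mall) within [start, end]."""
    column = {"brand": 1, "mall": 2}[target]
    counts = Counter()
    for row in rows:
        if start <= row[3] <= end:
            counts[(row[0], row[column])] += 1
    return counts


year_start, year_end = date(2018, 1, 1), date(2018, 12, 31)
brand_affinity = affinity(transactions, "brand", year_start, year_end)
mall_affinity = affinity(transactions, "mall", year_start, year_end)
```

Restricting the window to 2018 drops the older BrandA purchase by c2, which is why the transaction date matters when affinity is read as a current preference.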

Customer Segmentation:
This is one of the best features of Neo4j. Customers can be clustered by location, brand category, mall, brand, gender and so on. In short, clustering is possible with any categorical data. This helps us target audiences with specific, niche selections.

Suppose we want to run the query, “Get all customers who are of nationality X or Y, who are male or female, and who shopped at brands in category A, B or C within a certain date range”. In SQL this takes a long time to run because it needs a large number of joins. CQL, by contrast, will be much faster because it traverses the relationships directly.

Figure 7: Customer Segmentation Example

In the above figure, customers 16, 2, 14, 1 and 375 are the results of this query. This is just one example; the possibilities are endless.
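
The filter logic of such a segmentation query can be sketched in plain Python as follows. It only illustrates which conditions are combined and says nothing about performance. The customer profiles, brand names and categories are invented for the example.

```python
from datetime import date

# Hypothetical sample data standing in for nodes and relationships.
customers = {
    "c1": {"nationality": "X", "gender": "Male"},
    "c2": {"nationality": "Y", "gender": "Female"},
    "c3": {"nationality": "Z", "gender": "Male"},
}
brand_category = {"BrandA": "A", "BrandB": "B", "BrandD": "D"}
purchases = [
    ("c1", "BrandA", date(2018, 2, 1)),
    ("c2", "BrandD", date(2018, 2, 3)),
    ("c2", "BrandB", date(2017, 1, 1)),
    ("c3", "BrandA", date(2018, 2, 5)),
]


def segment(nationalities, genders, categories, start, end):
    """Return customers matching every categorical filter and the date range."""
    matched = set()
    for customer_id, brand, when in purchases:
        profile = customers[customer_id]
        if (profile["nationality"] in nationalities
                and profile["gender"] in genders
                and brand_category[brand] in categories
                and start <= when <= end):
            matched.add(customer_id)
    return sorted(matched)


result = segment({"X", "Y"}, {"Male", "Female"}, {"A", "B", "C"},
                 date(2018, 1, 1), date(2018, 12, 31))
```

Here only c1 qualifies: c2 bought from a matching category outside the date range, and c3 has a nationality that is not in the filter.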

Recommendations:
Neo4j is known for its recommendation capability. Today, market giants like Amazon, Netflix, eBay and others use graph databases to build recommendation engines. In fact, Netflix developed multiple recommendation engines to enhance the customer experience. Neo4j supports various recommendation strategies, such as relational recommendation, collaborative filtering and personalized recommendation. Recommendations can even be made in real time. Let’s look at some of them:

Collaborative Filtering based Recommendation:
This method recommends things to a customer or shopper based on what other shoppers like. For instance, if we ask each shopper to rate a store or a mall they visited, on a scale of 5 or 10, we can find all the similar customers based on their ratings. This makes the brand or mall affinity more meaningful. Similarly, the frequency of visits can also be used to measure how much a shopper likes a place.

The recommendation itself relies on calculating the similarity between customers who gave similar ratings to the brands or malls they rated. If two customers are quite similar, we can recommend to each of them the brands or malls that only the other one likes. Neo4j is one of the best databases for building recommendation engines. With Neo4j, an additional relationship between similar customers has to be created, and ratings or frequencies can be assigned as a property of that relationship. We can then perform the calculations on these property values and make the recommendations.
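
A minimal sketch of this idea in plain Python is shown below. Cosine similarity over the brands both customers have rated is one common choice and is an assumption here, since the text does not name a specific measure. The ratings are invented sample data.

```python
from math import sqrt

# Hypothetical ratings on a scale of 5: customer -> {brand: rating}
ratings = {
    "c1": {"BrandA": 5, "BrandB": 4, "BrandC": 1},
    "c2": {"BrandA": 5, "BrandB": 5, "BrandD": 4},
    "c3": {"BrandA": 1, "BrandB": 2, "BrandC": 5},
}


def cosine_similarity(a, b):
    """Cosine similarity computed only over brands rated by both customers."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = sqrt(sum(a[k] ** 2 for k in shared))
    norm_b = sqrt(sum(b[k] ** 2 for k in shared))
    return dot / (norm_a * norm_b)


def recommend(target, min_rating=4):
    """Suggest brands the most similar customer rated highly and the target has not rated."""
    others = [c for c in ratings if c != target]
    most_similar = max(
        others, key=lambda c: cosine_similarity(ratings[target], ratings[c]))
    return sorted(brand for brand, score in ratings[most_similar].items()
                  if score >= min_rating and brand not in ratings[target])


suggestions = recommend("c1")
```

In a graph, the value returned by cosine_similarity is the kind of number that could be stored as a property on the relationship between two similar customers.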

Personalized Recommendation:
A customer’s journey can tell us a lot. The time of each visit can explain their shopping behavior. Furthermore, the frequency with which they buy a product, or buy from a certain brand or brand category, can help us recommend similar brands or malls to them. For example, a common phrase used to recommend something to a customer is “you may also like …”, and this is easy to do with Neo4j.

To do this, we can calculate the similarity between each brand within a category, or with brands in other matching categories. The match score can be used to decide whether or not to recommend the matched brands to a customer. For real-time brand recommendation, distance could be a good criterion. The distance between each pair of stores can be calculated from the layout of the mall and stored in the Neo4j database. If a customer enters a store and their data is captured instantly, we can immediately recommend some of their favorite restaurants. This relies on the customer’s past purchase data, which shows the restaurants they frequently visit after shopping. The same can also work the other way around.
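
The distance-based, real-time case can be sketched in plain Python like this. The store and restaurant names, the distances in meters and the visit threshold are all hypothetical values chosen for illustration.

```python
# Hypothetical walking distances in meters, derived from the mall layout.
distance_m = {
    ("Store1", "RestaurantA"): 120,
    ("Store1", "RestaurantB"): 40,
    ("Store1", "RestaurantC"): 15,
}

# Hypothetical history: how often each customer visited a restaurant
# after shopping.
visits_after_shopping = {
    "c1": {"RestaurantA": 6, "RestaurantB": 4, "RestaurantC": 0},
}


def nearby_favorites(customer_id, current_store, min_visits=3):
    """Return the customer's frequently visited restaurants, nearest first."""
    history = visits_after_shopping[customer_id]
    favorites = [name for name, count in history.items()
                 if count >= min_visits]
    return sorted(favorites, key=lambda name: distance_m[(current_store, name)])


suggestions = nearby_favorites("c1", "Store1")
```

RestaurantC is the closest but is never suggested, because the customer has no history there. The ranking combines past behavior with current location rather than using distance alone.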

Campaign Response:
Marketing campaigns can be tracked easily with Neo4j. During a campaign, all the customers who received messages or emails are connected to the campaign node in the database. After the campaign, we can find out whether those customers made any purchase within the campaign period by checking the customer-brand or customer-mall relationships for that period.

Moreover, the results can be broken down further. For example, the data can be filtered by nationality, gender, date, day of the week or any other attribute. These analyses can be used to plan future campaigns.
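
The campaign check described above can be sketched in plain Python as follows. The campaign name, dates, recipients and purchases are invented sample data, and the response rate is simply responders divided by recipients.

```python
from datetime import date

# Hypothetical campaign node with its period and connected customers.
campaign = {
    "name": "SpringSale",
    "start": date(2018, 3, 1),
    "end": date(2018, 3, 31),
    "recipients": {"c1", "c2", "c3", "c4"},
}

# Hypothetical purchases: (customer_id, purchase_date)
purchases = [
    ("c1", date(2018, 3, 5)),
    ("c2", date(2018, 4, 2)),   # after the campaign ended
    ("c3", date(2018, 3, 30)),
    ("c5", date(2018, 3, 10)),  # never received the campaign
]


def responders(campaign, purchases):
    """Recipients who made at least one purchase inside the campaign period."""
    return sorted({
        customer_id
        for customer_id, when in purchases
        if customer_id in campaign["recipients"]
        and campaign["start"] <= when <= campaign["end"]
    })


responded = responders(campaign, purchases)
response_rate = len(responded) / len(campaign["recipients"])
```

The same function could be applied to a subset of recipients, for instance one nationality or gender, to obtain the finer breakdowns mentioned above.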

Conclusion:

Neo4j has various applications. It is in line with the latest developments in the technology industry and is being used for Sentiment Analysis, Natural Language Processing and Machine Learning, to name a few.

Companies have adopted Neo4j because it can be as much as 1,000 times faster than relational databases when working with connected data. Neo4j was the key technology in the Paradise Papers investigation. Companies like Walmart, Airbnb, eBay, Microsoft, IBM and many others are exploiting the power of Neo4j to discover new business insights, launch new products and services, and attract new customers. The advantages these companies have gained make Neo4j’s graph-based search and discovery a serious business advantage that should not be ignored.

To learn more about using Neo4j and how to achieve exceptional value and faster data analysis from your large data sets, please get in touch with us at ucssolutions.com.

© 2024 Unique Computer Systems.