The Art of Search

Seek and ye shall find! Shall we?!

Aug 16, 2020

Hi there,

We are excited to welcome you to the first edition of our newsletter!

Humans are a one-of-a-kind curious species. Seeking information is primal to our nature. Interestingly, we have been trying to make the act of search as easy and efficient as possible since around 245 BC, and the effort continues to this day.

This edition explores how our tools for search have evolved alongside the information we generate, and how that information has taken on a new face as data that enables decision-making in various domains.

We hope this edition helps you find a way into your own curiosity and ponder your personal search patterns, not just on the internet but in the physical world too!

Abstractions

“We never seek things for themselves, but for the search.” - Blaise Pascal

One peculiar feature of our search, and what Pascal is also trying to convey above, is that it is a non-linear process. The journey of finding one thing does not lead to just one result but to several interesting outputs, insights, and ideas, and once in a while to awesome discoveries.

Imagine the compound effect of a single quest, a speck of curiosity, one search on Google: how much more information does it create?! Granted, it can also be a distraction at times. Chalk that up as a cost of the quest.

Ever since we started documenting the findings of our curiosity and preserving them, searching through existing information has been a challenge. The problem first came up around 245 BC. As the Great Library of Alexandria grew, it became increasingly difficult for patrons to locate relevant material. This led to the invention of the first library catalog ever, the ‘Pinakes’, by Callimachus, a famous poet of that time.

Image Source: Wikipedia

From the times of handwritten scrolls to the current age of Big Data, AI, and ML, slowly turning into an Age of Reckoning, the volume and forms of information have kept increasing. Thanks to cheap storage, we began collecting whatever information we found useful, more of it and more frequently, and came to call it data.

Another challenge, and a peculiar characteristic of the act of search, is that, funnily, more often than not we do not know exactly what we are searching for. Sounds poetic, right!? We have some guiding pointers or keywords which help us reach our destination. A basic book index, a bibliography, references, etc. are a few examples highlighting this behavior.

Let’s look at how technology has contributed to solving this problem.

Actuality

Nowadays, when we think of search, search engines come to mind. For better and sometimes worse, we have come a long way from being restricted to the information in our local newspaper and library.

As the Internet evolved, so did our search capabilities. The 90s and early 2000s saw many search engines competing for mindshare. This can be seen in the timeline here. Fast forward to today and search is synonymous with the big G.*
*For those interested, here is the original research paper which introduces Google’s PageRank algorithm: The Anatomy of a Large-Scale Hypertextual Web Search Engine.
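For a rough feel of the idea behind PageRank, here is a minimal sketch in Python. It uses the commonly taught normalized variant with a damping factor of 0.85 and plain iteration; it is an illustration under those assumptions, not Google's actual implementation. The `toy_web` dictionary and its page names are made up for the example, and the function assumes every linked page also appears as a key.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate PageRank for a dict {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page gets a small baseline score (the "random jump").
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its score over all pages.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web, purely for illustration.
toy_web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
    "orphan": ["home"],
}

ranks = pagerank(toy_web)
for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page:>6}: {score:.3f}")
```

The page that many others link to ("home") ends up on top, while the page nobody links to ("orphan") stays at the baseline score. That is the intuition: a link acts like a vote, and votes from well-ranked pages count for more.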

Recently, however, Google’s search results have slowly been getting taken over by hidden ads and keyword hijacking. Read more in detail here.

The quest for information goes beyond our Googles, Bings, and DuckDuckGos. Different domains, teams, and businesses require different approaches for searching through their own data.

We search by drawing out connections, and having our digital data available in a connected fashion makes our search faster and more effective. Google’s knowledge graph is a testament to this approach.
A couple of applications that we’ve recently come across and used, both of which take the connected-data approach to search, are:

Roam Research: a note-taking tool for creating a personal knowledge base, inspired by the Zettelkasten method.
Connected Papers: a visual tool to help researchers and applied scientists find academic papers relevant to their field of work.

On a related note, here is a primer about the fascinating world of knowledge graphs.

Accompaniments

If you have ever used a public library, chances are they have records of your visits and the books that you checked out. If the librarian were so inclined, they could go through the catalogs and analyze your interests and perhaps your personality. Similarly, by the nature of their function, search tools have the capability to capture our identity and inclinations.

Privacy in the age of the internet and social media is as pertinent a topic as any. You can often find debates raging about it. Below are some interesting links around the topic.

Google’s Privacy Policy has mirrored the evolution of the internet over the years.
While not restricted to search, here is an interesting account of Mozilla’s mission to fix Internet privacy.
Decentralization is often brought up as an answer to privacy concerns. While researching, we came across YaCy, software that allows you to set up a decentralized search engine.

Weird Trivia

October 1st, 2015, is regarded as the official day the last library catalog card was printed.
Can you find an anomaly or detect fraud in a dataset that seems to follow no pattern and is as random as the heights of the 60 tallest structures in the world? Yes, you can, through the Newcomb-Benford law. Want to know more about it? Watch episode 4, ‘Digits’, of an amazing series on Netflix named Connected.
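If you would like to poke at the Newcomb-Benford law yourself, here is a small Python sketch. The law says that in many naturally occurring datasets the leading digit d shows up with probability log10(1 + 1/d), so about 30% of the numbers start with 1. The helper names below are our own, and the sample data (the first 100 powers of 2) is a stand-in chosen because it is known to follow the law closely; it is not a real financial or geographic dataset.

```python
import math
from collections import Counter


def first_digit(value):
    """Return the leading non-zero digit of a non-zero number."""
    value = abs(value)
    while value >= 10:
        value /= 10
    while value < 1:
        value *= 10
    return int(value)


def benford_expected(digit):
    """Share of values expected to start with `digit` under Benford's law."""
    return math.log10(1 + 1 / digit)


def first_digit_report(values):
    """Map each digit 1..9 to (observed share, expected share)."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    total = len(digits)
    return {d: (counts[d] / total, benford_expected(d)) for d in range(1, 10)}


# Stand-in data for illustration: 2, 4, 8, ..., 2**100.
sample = [2 ** n for n in range(1, 101)]
report = first_digit_report(sample)

print("digit  observed  expected")
for digit, (observed, expected) in report.items():
    print(f"{digit:>5}  {observed:>8.3f}  {expected:>8.3f}")
```

Swap `sample` for your own list of numbers and compare the two columns. A large gap between observed and expected shares does not prove fraud, but it is a hint that the data deserves a closer look.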

If you want to keep receiving future editions, subscribe to the newsletter below. Till then, happy musing!

DataDuet
