Search engine implementation for CIS 555: Internet and Web Systems. Set of services to crawl, index, rank, query, and visualize assets across the web.


April 2018




Java, jQuery, AWS, S3, Apache Spark, MySQL, DynamoDB


Nihar Patil, Somil Govani, Vibhav Jagwani



Globle was a group project I built at the end of my junior year at UPenn. The major components were a web crawler, a page indexer, a page ranker, and an API/web interface. We also built an autocomplete engine that tracked the most common queries submitted by users.
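As a rough illustration of the autocomplete idea described above, here is a minimal sketch, not the project's actual code. The class name `QueryAutocomplete` and its methods are hypothetical, and it assumes a plain in-memory count of queries with a linear scan over them. A trie or a persistent store would be a more realistic choice at scale.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: counts submitted queries and suggests the most
// common ones for a given prefix. Not the project's actual implementation.
class QueryAutocomplete {
    // Normalized query text -> number of times it was submitted.
    private final Map<String, Integer> counts = new HashMap<>();

    /** Records one submission of a query, normalized to lowercase. */
    synchronized void record(String query) {
        String key = query.trim().toLowerCase();
        if (!key.isEmpty()) {
            counts.merge(key, 1, Integer::sum);
        }
    }

    /** Returns up to `limit` recorded queries starting with `prefix`, most frequent first. */
    synchronized List<String> suggest(String prefix, int limit) {
        String normalized = prefix.trim().toLowerCase();
        List<Map.Entry<String, Integer>> matches = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getKey().startsWith(normalized)) {
                matches.add(entry);
            }
        }
        // Higher counts first; ties are broken alphabetically for stable output.
        matches.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> suggestions = new ArrayList<>();
        for (int i = 0; i < matches.size() && i < limit; i++) {
            suggestions.add(matches.get(i).getKey());
        }
        return suggestions;
    }
}
```

In this sketch, the API layer would call `record` on every submitted search and `suggest` as the user types.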

I implemented an aggressive in-memory cache and a multi-threaded API handler that computed scores to match indexed pages against a user's search query. I also integrated other data sources, such as Wikipedia and YouTube, to supplement our search results.
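The combination of a result cache and parallel scoring can be sketched roughly as follows. This is an illustration under stated assumptions, not the code we shipped. The class `CachedQueryScorer`, the index shape (page URL mapped to per-term weights), and the scoring rule (a sum of the weights of matching terms) are all hypothetical stand-ins, and the cache here never evicts entries.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: scores pages for a query on a thread pool and caches
// the ranked result per normalized query string.
class CachedQueryScorer implements AutoCloseable {
    // Assumed index shape: page URL -> (term -> weight).
    private final Map<String, Map<String, Double>> index;
    // Normalized query -> ranked URLs. Unbounded; a real cache would evict.
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    private final ExecutorService pool;

    CachedQueryScorer(Map<String, Map<String, Double>> index, int threads) {
        this.index = index;
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Returns URLs with a positive score for the query, best first. */
    List<String> search(String query) {
        String key = query.trim().toLowerCase();
        List<String> cached = cache.get(key);
        if (cached != null) {
            return cached; // cache hit: skip scoring entirely
        }

        String[] terms = key.split("\\s+");
        List<String> urls = new ArrayList<>(index.keySet());
        List<Future<Double>> futures = new ArrayList<>();
        for (String url : urls) {
            Map<String, Double> weights = index.get(url);
            Callable<Double> task = () -> score(weights, terms);
            futures.add(pool.submit(task)); // one scoring task per page
        }

        Map<String, Double> scores = new HashMap<>();
        try {
            for (int i = 0; i < urls.size(); i++) {
                double s = futures.get(i).get();
                if (s > 0) {
                    scores.put(urls.get(i), s);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while scoring", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Scoring task failed", e);
        }

        List<String> ranked = new ArrayList<>(scores.keySet());
        // Highest score first; ties are broken by URL for stable output.
        ranked.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        List<String> result = Collections.unmodifiableList(ranked);
        cache.put(key, result);
        return result;
    }

    // Placeholder scoring rule: sum of the weights of the query terms on the page.
    private static double score(Map<String, Double> weights, String[] terms) {
        double total = 0.0;
        for (String term : terms) {
            total += weights.getOrDefault(term, 0.0);
        }
        return total;
    }

    @Override
    public void close() {
        pool.shutdown();
    }
}
```

Two concurrent requests for the same uncached query may both compute the result in this sketch. That wastes some work but stays correct, since both produce the same ranking.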

All of our tools were deployed across AWS and built on AWS's Java SDKs. I ran performance tests to measure the gains from sharding work across machines and from upgrading machine hardware. This was a great exercise in working with cloud infrastructure and wiring together a variety of services.

The next semester, I was fortunate enough to be a TA for the course.
