Projects
Graduate Coursework
3.62 GPA, University of Southern California, CS, 2018
Operating Systems :
- Implemented kernel threads, processes, mutexes, scheduler primitives, kshell, interrupt handling, a virtual file system, virtual memory management, and shadow objects for Weenix, a general-purpose toy operating system.
- Designed a multithreaded emulator in C using POSIX threads, mutexes, and signal handling.
- The emulator serves packets arriving at a facility (Queue 1) using tokens arriving in a token bucket (Queue 2). The two queues and two servers together serve the packets.
- The emulator was designed to improve throughput and delay, with statistics printed after each run.
- Implemented on Ubuntu 14.04 in C.
- Implemented a lookup service to tackle the problem of locating data that is distributed over multiple nodes in a network. A database of English words and their definitions was distributed among three backend servers.
- A client issues a dictionary search for a word to a main server (similar to Amazon Web Services, and referred to as the AWS server). The server forwards the request to each of the three backend servers, collects their results, performs additional computation if needed, and sends the results back to the monitor and the client.
- Communication between the client, the AWS server, and the monitor is over TCP, while communication between the AWS server and the three backend servers is over UDP.
- Implemented on an Ubuntu 16.04 VM in C++.
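A minimal sketch of the token-bucket idea behind the emulator described earlier in this section. It is written in Python rather than the original C, and it uses discrete ticks in a single thread instead of POSIX threads. It assumes packets wait in Queue 1 until enough tokens are available and then move to Queue 2 for the two servers; all function and parameter names are illustrative and not taken from the project.

```python
from collections import deque


def simulate(arrivals, token_interval, bucket_depth, tokens_needed,
             service_time, num_servers=2):
    """Tick-based token-bucket run; returns {packet_id: (arrival, departure)}."""
    if tokens_needed > bucket_depth or token_interval < 1:
        raise ValueError("packets could never collect enough tokens")
    # Packets not yet arrived, ordered by arrival tick.
    pending = deque(sorted(enumerate(arrivals), key=lambda p: p[1]))
    q1, q2 = deque(), deque()           # Q1: waiting for tokens, Q2: waiting for a server
    tokens = 0
    server_free_at = [0] * num_servers  # tick at which each server becomes idle
    finished = {}
    t = 0
    while len(finished) < len(arrivals):
        # One token arrives every token_interval ticks; overflow tokens are dropped.
        if t > 0 and t % token_interval == 0 and tokens < bucket_depth:
            tokens += 1
        # Admit packets whose arrival tick has been reached.
        while pending and pending[0][1] <= t:
            q1.append(pending.popleft())
        # Move packets from Q1 to Q2 while the bucket can pay for them.
        while q1 and tokens >= tokens_needed:
            tokens -= tokens_needed
            q2.append(q1.popleft())
        # Hand waiting packets to any idle server.
        for s in range(num_servers):
            if q2 and server_free_at[s] <= t:
                pid, arrived = q2.popleft()
                server_free_at[s] = t + service_time
                finished[pid] = (arrived, t + service_time)
        t += 1
    return finished


def mean_delay(finished):
    """Average time in system (departure minus arrival) over all packets."""
    return sum(dep - arr for arr, dep in finished.values()) / len(finished)
```

Calling `simulate([0, 0, 1], token_interval=1, bucket_depth=5, tokens_needed=1, service_time=3)` and passing the result to `mean_delay` gives the kind of per-run statistic the emulator printed.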
Machine Learning :
- Implemented a generator network that realistically transforms the gender of a person, out of curiosity about what famous male Marvel stars would look like if born with two X chromosomes.
- Built a generative model to mimic the writing style of Bertrand Russell, the prominent British mathematician, philosopher, prolific writer, and political activist.
Approaches: Trained an LSTM to mimic Russell's style and thoughts.
- Used a convolutional neural network for image colorization, which turns a grayscale image into a color image. Converting an image to grayscale loses color information, so converting a grayscale image back to color is not an easy job. I used the CIFAR-10 dataset.
Tools: OpenCV, scikit-learn
- Binary classification on the banknote authentication dataset, multi-class and multi-label classification using support vector machines on the Anuran Calls dataset, and K-means clustering on a multi-class and multi-label data set.
Approaches: Active and passive learning, Monte Carlo simulation, SVMs with linear and Gaussian kernels, L1-penalized SVMs, SMOTE, CH or Gap statistics or scree plots
- Communities and Crime dataset, APS Failure dataset
Approaches: Data imputation techniques, linear and ridge regression, PCR models, boosting trees, multivariate regression trees, L1-penalized gradient boosting trees, XGBoost, random forest, out-of-bag error estimate, ROC, AUC, Weka, SMOTE
- An interesting task in machine learning is classification of time series. In this problem, I tried to classify human activities based on time series obtained from a wireless sensor network.
Approaches: Time-domain features, bootstrap confidence intervals, binary classification using logistic regression, p-values, backward selection using sklearn.feature_selection, stratified cross-validation, Python's Recursive Feature Elimination, ROC, AUC, L1-penalized logistic regression, L1 regularization, L1-penalized multinomial regression, Naive Bayes classifier using both Gaussian and Multinomial priors
- Data were extracted from images taken of genuine and forged banknote-like specimens. For digitization, an industrial camera usually used for print inspection was used. The final images have 400 x 400 pixels. Due to the object lens and the distance to the investigated object, grayscale pictures with a resolution of about 660 dpi were obtained. A wavelet transform tool was used to extract features from the images.
Approaches: Scatterplots, box plots, classification using KNN, learning curves, Euclidean, Minkowski, Manhattan, Chebyshev, and Mahalanobis distances, simple linear regression, association of interactions of predictors with the response using p-values, KNN regression
Web Technologies :
- Developed an Android application that allows users to search for a place using a live JSON API, view its information, save it as a favorite, and/or post about it on Twitter. The web application is hosted on Amazon Cloud.
Technologies: Java, XML, Android Studio, Google Places APIs, Picasso, Volley, Amazon Web Services (AWS), Elastic Compute Cloud (EC2), Google App Engine (GAE), PHP/Node.js, Autocomplete for places, RecyclerView, Google Maps SDK, ViewPager
- Created a webpage that allows users to search for places using the Google Places API. Once the user clicks a button to search for place details, the webpage displays several tabs containing an info table, photos of the place, a map and route search form, and reviews, respectively. The webpage also supports adding places to and removing places from a favorites list, and posting place info to Twitter.
Technologies: Ajax, JSON and Responsive Design, Bootstrap/Angular/jQuery/Cloud, HTML5, PHP/Node.js, Google Cloud App Engine and Amazon Web Services, Google Places APIs and Yelp APIs, XHTML and CSS, DOM, XML, XMLHttpRequest, Amazon Elastic Compute Cloud (EC2), Autocomplete, User Location, Pagination, Moment.js, Angular Google Maps
- Created a webpage that allows users to search for place information using the Google Places API, with the results displayed in a tabular format. The page also provides reviews and photos for the selected place.
Technologies: PHP, Google Places API, JSON parsers in PHP and JavaScript, Google Maps Geocoding API, Google Places API Nearby Search, Place Details, Google Maps JavaScript Library
- Wrote an HTML/JavaScript program that takes the URL of a JSON document containing trucking company information, parses the JSON file, and extracts the list of trucking companies, displaying them in a table. The JavaScript program is embedded in an HTML file so that it can be executed within a browser.
Technologies: JavaScript JSON objects, JSON.parse parser and synchronous XMLHttpRequest
- Composed documents directly in HTML and CSS. This is very helpful, as it is often necessary to modify existing documents. Also, when writing server-side scripts, one must generate HTML.
Technologies: HTML and CSS
Natural Language Processing :
- Cleaned and processed DPS records to classify each incident on a scale of 1-5 based on the level of danger. Classifiers: Bag of Words, Multinomial Naive Bayes and Support Vector Machines based on TF-IDF scores, and LSTM recurrent neural networks.
Database Systems :
- Database system architecture; conceptual database models; semantic, object-oriented, logic-based, and relational databases; user and program interfaces; database system implementation; integrity, security, concurrency and recovery.
Information Retrieval and Web Search Engines :
- Devised a search engine to search through 15,000 news webpages. PageRank and TF-IDF weighting were used to rank the search results retrieved from Solr, with features such as spell checking, autocomplete, and snippets.
Technologies: Java, PHP, JavaScript, Python, jQuery, Apache Solr, Lucene, jSoup
- Using a web server, I created a web page with a text box that a user can retrieve and enter a query into. The user’s query is processed by a program on the web server, which formats the query and sends it to Solr. Solr processes the query and returns results in JSON format. A program on the web server reformats the results and presents them to the user as any search engine would. Results are clickable (i.e., they open the actual web page on the internet).
Technologies: PHP, NetworkX library, Solr, Lucene, PageRank
- Created an inverted index of the words occurring in a set of web pages, and gained hands-on experience with GCP App Engine using MapReduce.
- Built a simple web crawler to measure aspects of a crawl, study its characteristics, download web pages, and gather webpage metadata, all from pre-selected news websites.
- Compared the search results from Google and Bing, the two leading US search engines. Many search engine comparison studies have been done. All of them use samples of data, some small and some large, so no general conclusions can be drawn. But it is always instructive to see how the two search engines match up, even on a small data set. I issued a set of queries and evaluated the returned results for relevance. These studies do not seek to answer the ultimate question of which search engine is “best”. Rather, we stick to a more modest research question: which search engine performs best when considering the first five results for a given query?
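For the inverted-index exercise listed above, the map and reduce steps can be illustrated with a small in-memory Python sketch. This is not the original MapReduce job and does not run on GCP; the function names, the tokenization rule, and the choice to store per-document word counts are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (word, doc_id) pair per word occurrence in a document."""
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        yield word, doc_id


def reduce_phase(pairs):
    """Group pairs by word and count occurrences per document."""
    index = defaultdict(lambda: defaultdict(int))
    for word, doc_id in pairs:
        index[word][doc_id] += 1
    # Convert nested defaultdicts to plain dicts for a clean result.
    return {word: dict(counts) for word, counts in index.items()}


def build_index(docs):
    """docs maps a document id to its text; returns word -> {doc_id: count}."""
    pairs = (pair
             for doc_id, text in docs.items()
             for pair in map_phase(doc_id, text))
    return reduce_phase(pairs)
```

Given `docs = {"a.html": "Solr ranks pages", "b.html": "pages link to pages"}`, `build_index(docs)["pages"]` maps each page to how often the word occurs in it. In a real MapReduce run the framework performs the grouping between the two phases across machines.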
Analysis of algorithms :
- Explores techniques such as recursion, Fourier transform ordering, and dynamic programming for efficient algorithm construction. Examples include arithmetic, algebraic, graph, pattern matching, sorting, and searching algorithms.
Undergraduate Coursework
8.91/10 GPA (Distinction), Amrita Vishwa Vidyapeetham University, CS, 2015
Interactive Yoga poses correction :
- Partially devised an articulated human body model that tracks human motion in a video sequence and suggests corrections if the user is performing the yoga pose incorrectly.
Sudoku Solution builder :
- Implemented a Sudoku solution builder using human-analogy algorithms such as singles, hidden singles, locked candidates, and naked and hidden pairs, triples, and quads. Finally, XY-wing would populate the given Sudoku grid with answers.
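As an illustration of the simplest of these human-style techniques, here is a Python sketch of the singles rule only: a cell is filled when exactly one digit is still allowed by its row, column, and 3x3 box. It is not the original implementation, covers none of the other techniques, and uses hypothetical function names; empty cells are assumed to be stored as 0.

```python
def candidates(grid, r, c):
    """Digits not yet used in the row, column, or 3x3 box of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - used


def fill_singles(grid):
    """Repeatedly fill cells that have exactly one candidate; returns a new grid."""
    grid = [row[:] for row in grid]  # work on a copy, leave the input untouched
    progress = True
    while progress:
        progress = False
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    options = candidates(grid, r, c)
                    if len(options) == 1:
                        grid[r][c] = options.pop()
                        progress = True
    return grid
```

Cells that still hold 0 after `fill_singles` returns are the ones that need the stronger techniques listed above.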
Tracking algorithm for nimble e-mail system :
- An implementation, in C, of an IEEE paper that describes a new tracking algorithm to revamp the current e-mail system.
Dictionary and auto-fill application :
- A slightly modified Trie data structure, implemented in C++ along with reverse buckets to take care of synonyms. It was used for filling in online forms, where word suggestions and auto-fill features assist the user and make life easier.
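A minimal Python sketch of prefix-based suggestions with a plain trie, to show the idea behind the auto-fill feature. It is not the original C++ code, it does not model the modification or the reverse buckets for synonyms, and the class and method names are illustrative.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if a stored word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Add a word, creating nodes along its path as needed."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def suggest(self, prefix, limit=5):
        """Return up to `limit` stored words starting with `prefix`, alphabetically."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, word = stack.pop()
            if current.is_word:
                results.append(word)
            # Push children in reverse order so the smallest letter is popped first.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], word + ch))
        return results
```

After inserting a word list, a form field would call `suggest` with whatever the user has typed so far and show the returned words as completions.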
Medical College management system :
- An interactive system for enquiring about blood quantity availability in the medical college, at the click of an e-mail. It was developed using Java and Microsoft SQL Server.
Graphical text editor :
- The application has tools similar to those in MS Notepad and Notepad++, as well as tools which can animate them. It was developed in C.