Hi, I'm Joseph!

I'm a full-stack data scientist and junior quant researcher with a passion for building end-to-end solutions. From back-end (Python) to front-end (TypeScript/JavaScript, HTML/CSS), and from databases (MongoDB, SQL) to DevOps (Docker, CI/CD, microservices), I thrive on turning complex problems into scalable systems.

My expertise lies in natural language processing (NLP)—working with semantic embeddings, fine-tuning models, and leveraging vector databases. I also specialize in machine learning techniques like clustering, dimensionality reduction, and classification to uncover insights from data.

Currently, I'm a Junior Quant Researcher at a Buy-Side Quant Hedge Fund, where I focus on applying NLP and machine learning to optimize trading strategies and research workflows.

When I'm not coding or analyzing data, you'll find me sipping tea, reading, or experimenting in the kitchen.

Chicago, Illinois
Open to interesting opportunities!

About

About me

I'm Joseph, a Junior Quant Researcher at Bayesian Capital Management. My journey into data science and junior quant research began with a curiosity for solving complex problems and a love for storytelling through data.

I specialize in natural language processing (NLP) and machine learning, using techniques like semantic embeddings, clustering, and dimensionality reduction to uncover insights. One of my proudest achievements was building an LLM system during my internship that automated data classification, saving researchers hours at a multi-billion dollar hedge fund.

What drives me is the challenge of turning raw data into actionable insights—whether it's optimizing trading strategies, streamlining research workflows, or exploring the latest advancements in AI.

Outside of work, I'm a lifelong learner with interests in history, psychology, and philosophy. You'll often find me sipping tea, reading a book, or experimenting in the kitchen. I believe that curiosity and creativity are the keys to solving the world's most interesting problems.

Let's connect! Reach me at joseph@bayesian.capital.

My Journey

  • 2024 - Junior Quant Researcher at Bayesian Capital Management. Built an AI pipeline for a multi-billion dollar hedge fund that automates data classification, saving hundreds of hours of analysis on every dataset going forward.
  • 2023 - Founded Anchor AI, developed a Markov chain model for ancient Rome research, and led a data science consulting team for the Arizona Diamondbacks. Also served as CTO at Cent Startup and founded the Marketing Initiative at Biola QCC.
  • 2022 - Conducted research on the stable marriage problem and engineered an XML parsing system for archaeological research data.
  • 2021 - Discovered my passion for data, math, and finance during a transformative gap year.

Skills

Languages

Python
R
TypeScript
HTML
CSS
JavaScript
Flutter/Dart
C++
SQL

Experience Timeline

Junior Quantitative Researcher & Data Scientist | Bayesian Capital Management

January 2024 - Present

This is where I work now :D. It's a little secretive right now >:3

Research Lead for Medieval Age Stylometry | Biola University

August 2024 - December 2024

Researched, designed, and built a stylometry pipeline based on a published model (arXiv:2310.11081) to perform author attribution on unknown medieval texts. This pipeline was used to identify contested works by analyzing writing styles, contributing to historical and literary research.

Quantitative Researcher & Data Scientist Intern | Bayesian Capital Management

June 2024 - August 2024

Developed an AI-powered pipeline for a $1B+ AUM quant hedge fund to extract insightful topics from noisy data. The pipeline automates the processing and classification of large financial text datasets, reducing manual workload by 90% and cutting costs by 70% per dataset. Used semantic embeddings, clustering models (HDBSCAN, K-Means), and multi-AI consensus scoring to extract key insights for trading strategies. Built an interactive web platform for exploring the data, improving research speed and accuracy.
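The description above does not say how the multi-AI consensus scoring works. One simple reading, sketched here purely as an assumption, is a majority vote over the labels that several models assign to the same text, with the share of agreeing models as the score. The function name `consensus_label` and the example labels are hypothetical.

```python
from collections import Counter


def consensus_label(labels):
    """Return (winning_label, agreement) for labels from several models.

    agreement is the fraction of models that chose the winning label.
    Ties go to the label that appears first in the input.
    """
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


# Hypothetical labels from three different models for one document.
label, agreement = consensus_label(["earnings", "earnings", "macro"])
```

A low agreement value could be used to flag a document for manual review instead of accepting the automated label.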

My Projects

Built a Better ATS System

Created a semantic-first ATS system that understands resume context instead of just counting keywords. Used Claude 3.5 for story extraction, VoyageAI for semantic embeddings, and Qdrant for vector search. Built with TypeScript/React frontend and Python/FastAPI backend, containerized with Docker and deployed on DigitalOcean. Try it out at http://192.34.61.136:3000 or check the code at https://github.com/Ensyllis/SemanticATS
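The core of semantic matching can be sketched without any of the services named above: embed the job description and each resume as vectors, then rank resumes by cosine similarity. This is a minimal illustration, not the project's code. The vectors are made-up stand-ins for real embeddings, and `rank_resumes` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_resumes(job_vector, resume_vectors):
    """Return (name, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(job_vector, vec))
        for name, vec in resume_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up 3-dimensional "embeddings"; real ones have hundreds of dimensions.
job = [0.9, 0.1, 0.3]
resumes = {"resume_a": [0.8, 0.2, 0.4], "resume_b": [0.1, 0.9, 0.2]}
ranking = rank_resumes(job, resumes)
```

In a deployed system a vector database would typically handle the similarity search at scale; the ranking logic stays the same.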

Stylometry on Medieval Age Authors

Joined a research group doing stylometry with transformer models. We fine-tuned a Hugging Face semantic embedding model (RoBERTa) on a virtual machine, then used contrastive learning at inference time to attribute unknown texts to likely authors based on writing style.

Built a Data Pipeline to Mass Email Potential Employers

Created a data pipeline that scraped SEC EDGAR data to find companies holding $100M in AUM, then enriched that data with website descriptions, contact information, and industry using MongoDB, Docker containers, the Hunter.io API, web scrapers, virtual machines, DigitalOcean, and topic modeling. This let me filter for companies aligned with my interests and email the people working at them. (I got a job offer before I ran this, so my infamy is not known.)

Project Lead for Democratized Financial Advice

ariadneportfolio.com <- prototype (do not use for real financial advice)

Developed an algorithm designed to democratize financial advice, enabling individuals with limited resources to build diversified stock portfolios. Used numerical optimization, specifically Sequential Least Squares Programming (SLSQP), to tailor investment strategies to individual risk tolerances and financial goals. Engineered and coded the underlying algorithms and the full stack using Python, SciPy, HTML, and JavaScript.

Built this Cute Personal Website!

I made this website using TypeScript, Next.js, and Tailwind CSS, then deployed it to Vercel with CI/CD. Not super sure what else to add here, you feel me? Look around the website for fun!

Friend's Birthday Gift

For my friend, I created a cute TypeScript website with everything they like and some things I'm grateful to them for. It's their birthday present, so I can't show it >:3 It's for their eyes only.

Analysis of Data Science and Statistics Job Market

Took the California Bureau's data and analyzed which of the occupations that Data Science and Statistics majors tend to go into have the highest average growth and the lowest variance at the same time.
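One way to balance "high average growth" against "low variance" is a single score that rewards the mean and penalizes the spread. The sketch below uses mean minus a weighted standard deviation. That scoring rule, the `risk_weight` parameter, and the growth figures are all assumptions made for illustration, not the method or data from the analysis.

```python
from statistics import mean, pstdev


def rank_occupations(growth_by_occupation, risk_weight=1.0):
    """Rank occupations by mean growth minus risk_weight * std deviation."""
    scored = []
    for occupation, rates in growth_by_occupation.items():
        score = mean(rates) - risk_weight * pstdev(rates)
        scored.append((occupation, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up yearly growth rates in percent, for illustration only.
growth = {
    "statistician": [3.0, 3.2, 2.8, 3.0],
    "data_analyst": [6.0, 1.0, 5.0, 0.0],
    "actuary": [2.0, 2.1, 1.9, 2.0],
}
ranking = rank_occupations(growth)
```

With these numbers the steady occupations rank above the volatile one even though two of them share the same average growth.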

Application Developer for Harpious, an App Designed for Speech-Impaired Individuals

Created a Flutter-based app that wraps the ElevenLabs API to generate realistic synthetic voice output. Demonstrated proficiency with Dart, Flutter, and RESTful APIs in delivering assistive technology solutions.

Factor Analysis of Women's Workforce Participation Using Logistic Regression with In- and Out-of-Sample Training

Using the Mroz87 data on women's income, we identified the factors that most strongly predict women's labor force participation. The strongest predictor was extra household income; setting that factor aside, the second strongest was education. We concluded by suggesting that investing in women's education can lead to higher labor force participation among women.

Financial Pitch - Quantitative Investing Strategy

Using pandas DataFrames, I analyzed stock fundamentals (P/E, cash flow, etc.), ranked each stock by percentile to get the top 50, then used Markowitz portfolio theory to pick the best stocks from that top 50. I pitched the results to my university's investment club.
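The percentile-ranking step can be sketched in plain Python. The example below ranks each stock on two fundamentals and averages the two percentile ranks. Treating a lower P/E as better, weighting the two metrics equally, and the tickers and figures themselves are all assumptions made for illustration.

```python
def percentile_ranks(values, higher_is_better=True):
    """For each value, the fraction of values it beats or ties (0 to 1]."""
    n = len(values)
    ranks = []
    for v in values:
        if higher_is_better:
            count = sum(1 for other in values if other <= v)
        else:
            count = sum(1 for other in values if other >= v)
        ranks.append(count / n)
    return ranks


def top_stocks(fundamentals, top_n):
    """Return the top_n tickers by average percentile rank."""
    tickers = list(fundamentals)
    pe = percentile_ranks(
        [fundamentals[t]["pe"] for t in tickers], higher_is_better=False
    )
    cash = percentile_ranks(
        [fundamentals[t]["cash_flow"] for t in tickers], higher_is_better=True
    )
    combined = {t: (p + c) / 2 for t, p, c in zip(tickers, pe, cash)}
    return sorted(tickers, key=lambda t: combined[t], reverse=True)[:top_n]


# Made-up fundamentals for three hypothetical tickers.
universe = {
    "AAA": {"pe": 10.0, "cash_flow": 500.0},
    "BBB": {"pe": 30.0, "cash_flow": 100.0},
    "CCC": {"pe": 20.0, "cash_flow": 300.0},
}
selected = top_stocks(universe, top_n=2)
```

With pandas, the same ranking is usually a one-liner per column via `Series.rank(pct=True)`; the loop version here just makes the arithmetic explicit.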

Automated Messaging Scripts

Programmed a script that automates WhatsApp messages and emails.

Financial Portfolio Optimization

Applied Markowitz's Modern Portfolio Theory in Python to design algorithms that balance risk and return.
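For two assets, the minimum-variance portfolio in Markowitz's framework has a closed form, which makes the risk side of the trade-off easy to see. The sketch below is a textbook special case, not the project's algorithm. It allows negative weights (short positions), and the volatilities and correlation are made-up inputs.

```python
import math


def min_variance_weights(sigma_1, sigma_2, rho):
    """Weights (w1, w2) of the two-asset minimum-variance portfolio.

    sigma_1, sigma_2: asset volatilities; rho: their correlation.
    """
    cov = rho * sigma_1 * sigma_2
    denom = sigma_1 ** 2 + sigma_2 ** 2 - 2 * cov
    if denom == 0:
        raise ValueError("assets are identical in risk; weights are undefined")
    w1 = (sigma_2 ** 2 - cov) / denom
    return w1, 1 - w1


def portfolio_volatility(w1, w2, sigma_1, sigma_2, rho):
    """Volatility of a two-asset portfolio with the given weights."""
    variance = (
        w1 ** 2 * sigma_1 ** 2
        + w2 ** 2 * sigma_2 ** 2
        + 2 * w1 * w2 * rho * sigma_1 * sigma_2
    )
    return math.sqrt(variance)


# Made-up inputs: 20% and 10% volatility, uncorrelated assets.
w1, w2 = min_variance_weights(0.20, 0.10, 0.0)
vol = portfolio_volatility(w1, w2, 0.20, 0.10, 0.0)
```

With these inputs the combined portfolio is less volatile than either asset alone, which is the diversification effect the theory is built on. With more assets or added constraints, a numerical optimizer replaces the closed form.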

Markdown Converted Website

Made a small data pipeline that takes Markdown files from Obsidian and renders them to TypeScript. It is meant for non-technical students to show off their artwork and portfolios.

Get to Know Me Better

What drives you in your work?

How would colleagues describe your work style?

Tell me about a significant achievement.

What's your approach to risk?

What makes you unique?

What's your biggest challenge?

Why are you in this field?

What's your ideal collaboration style?

Let's Connect

I'd love to chat about opportunities, collaborations, or interesting projects. Schedule a time that works best for you!

Prefer email? Reach me directly at josephliu1127@gmail.com