Recent - Joey's Page

Large Model Inference on Edge Devices (Master’s Graduation Project)

Duration: Dec. 2024 - Jul. 2025

Company: Signify

TL;DR: Enabled efficient on-device video captioning by applying quantization and partitioning to lightweight vision-language models, reducing memory usage while preserving accuracy.

Details

Large Vision-Language Models (VLMs) are hard to apply to video captioning on edge devices because of their high computational cost and memory footprint. This project investigates methods for efficient on-device inference based on model quantization and partitioning. The work begins with an evaluation of lightweight VLMs, shortlisting candidates by open-source availability, reproducibility of reported results, and model size. Both post-training quantization and quantization-aware training are considered to shrink the models while maintaining captioning quality. In addition, partitioning schemes such as encoder-decoder separation and sliding-window inference are proposed to distribute the inference load across edge devices. Experiments on the MSVD and MSR-VTT datasets show considerable savings in memory consumption with only a minimal loss in captioning quality. The work concludes with considerations for practical deployment and an outline of future directions in hybrid edge-cloud architectures for captioning.
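To make the quantization idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is an illustration only, not code from the project: the function names and the sample weights are hypothetical, and a real pipeline would rely on a framework's quantization tooling rather than hand-written loops.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of float weights to int8 codes."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero tensor, where any positive scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.50, -1.27, 0.03, 1.00]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Worst-case reconstruction error; bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Storing one byte per weight instead of a 4-byte float is where the memory saving comes from; the price is a rounding error of at most half the scale per weight.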

Stack: Vision-Language Models (VLMs), Model Quantization, PyTorch, Model Partitioning

Rental Notification Service (MVP)

Duration: Jul. 2023 - Aug. 2023

Role: Independent Developer

I independently built and launched a subscription-based rental listing notification service that gained 103 paying subscribers and generated €2,530 in revenue within two months. Beyond development, I also handled product design and growth marketing, taking the project from idea to a validated business.

Stack: Python, Web Scraping, HTML, Webhook bot, Growth Marketing

Knowledge-Enhanced Text Representation Toolkit for Natural Language Understanding

Duration: Jun. 2022 - Jan. 2023

Organization: Chinese Academy of Sciences

TL;DR: CogKTR is an open-source toolkit for knowledge-enhanced text representation in NLU, unifying acquisition, representation, injection, and application of external knowledge under the Unified Knowledge-Enhanced Paradigm (UniKEP). The toolkit is available on GitHub, with an online demo and an instruction video.

Details

Modern NLP starts with text representation, converting discrete texts into continuous embeddings. While pre-trained language models (PLMs) excel at this and have advanced natural language understanding (NLU), they usually rely only on textual context, which is insufficient for knowledge-intensive tasks. Integrating external knowledge into PLMs can produce richer, more knowledgeable representations. However, existing knowledge-enhanced methods vary greatly, making them hard to reproduce, extend, or combine.

To address this, we introduce CogKTR, a knowledge-enhanced text representation toolkit based on our Unified Knowledge-Enhanced Paradigm (UniKEP). It includes four stages:

Knowledge acquisition
Knowledge representation
Knowledge injection
Knowledge application

CogKTR offers:

Easy-to-use knowledge acquisition interfaces
Multi-source knowledge embeddings
Multiple knowledge-enhanced models
Support for diverse knowledge-intensive NLU tasks
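
The four UniKEP stages can be pictured as a small pipeline. The sketch below is a toy in plain Python and does not use CogKTR's actual API: every name in it (TOY_KB, embed, acquire, represent, inject, apply_task) is hypothetical, the knowledge base is made up, and the character-count embedding stands in for a real PLM encoder.

```python
from typing import Dict, List

# Toy knowledge base standing in for an external source such as Wikidata.
# The entries are made up for illustration.
TOY_KB: Dict[str, str] = {
    "paris": "capital city of france",
    "bert": "pre-trained language model",
}

DIM = 8  # size of the toy embedding


def embed(text: str) -> List[float]:
    """Toy character-count embedding; a real system would use a PLM."""
    vec = [0.0] * DIM
    for ch in text.lower():
        if ch.isalpha():
            vec[(ord(ch) - ord("a")) % DIM] += 1.0
    return vec


def acquire(text: str) -> List[str]:
    """Stage 1, acquisition: look up facts for tokens found in the KB."""
    return [TOY_KB[tok] for tok in text.lower().split() if tok in TOY_KB]


def represent(facts: List[str]) -> List[float]:
    """Stage 2, representation: embed the facts and average them."""
    if not facts:
        return [0.0] * DIM
    vecs = [embed(f) for f in facts]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def inject(text_vec: List[float], knowledge_vec: List[float]) -> List[float]:
    """Stage 3, injection: fuse text and knowledge by concatenation."""
    return text_vec + knowledge_vec


def apply_task(features: List[float]) -> str:
    """Stage 4, application: toy decision on whether knowledge was used."""
    return "knowledge-enhanced" if any(features[DIM:]) else "text-only"


text = "BERT was trained in Paris"
features = inject(embed(text), represent(acquire(text)))
label = apply_task(features)
```

Keeping each stage behind its own function is what lets different knowledge sources or fusion strategies be swapped without touching the rest of the pipeline.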

Stack: Python, PyTorch, BERT, Wikidata, WordNet, CogNet

Advertisement Strategy Optimization at Xiaomi

Duration: Mar. 2022 - May 2022

Company: Xiaomi (Internet Business Group)

TL;DR: Analyzed and optimized ad inventory, bidding, and fill strategies using performance metrics (eCPM, CTR, conversion rate), achieving +18% ad inventory, +3% revenue from bidding optimization, and +11% revenue from fill strategy optimization.

Details

Conducted data analysis on ad slot utilization within Xiaomi's content feed to identify underutilized inventory, expanding available ad space by 18% and boosting advertiser fill rate and revenue. Designed and ran A/B tests to optimize bidding algorithms and filtering strategies in video feeds, increasing eCPM and advertiser revenue by 3%. Evaluated and adjusted ad prioritization rules for video detail pages, improving eCPM and advertiser revenue by 11% through refined fill strategies.
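
For readers unfamiliar with the metrics, the sketch below shows how eCPM, CTR, and conversion rate are commonly computed and how two A/B test arms can be compared. The class, the function, and all numbers are hypothetical and chosen only for illustration; they are not the project's data or results.

```python
from dataclasses import dataclass


@dataclass
class ArmStats:
    """Aggregated counts for one arm of an A/B test (made-up numbers)."""
    impressions: int
    clicks: int
    conversions: int
    revenue: float

    def ecpm(self) -> float:
        """Effective revenue per thousand impressions."""
        return self.revenue / self.impressions * 1000

    def ctr(self) -> float:
        """Click-through rate: clicks per impression."""
        return self.clicks / self.impressions

    def conversion_rate(self) -> float:
        """Conversions per click."""
        return self.conversions / self.clicks


def relative_uplift(control: float, treatment: float) -> float:
    """Relative change of the treatment metric over the control metric."""
    return (treatment - control) / control


# Hypothetical arms, not real experiment data.
control = ArmStats(impressions=100_000, clicks=2_000, conversions=100, revenue=500.0)
treatment = ArmStats(impressions=100_000, clicks=2_100, conversions=105, revenue=520.0)

ecpm_uplift = relative_uplift(control.ecpm(), treatment.ecpm())
```

A raw uplift like this is only a starting point; in practice it would be paired with a significance test before a strategy is rolled out.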

Stack: Data Analysis, A/B Testing, eCPM Optimization, SQL, Digital Marketing

Java Patch Detection to Cope with API Breaking Changes

Duration: Feb. 2024 - May 2024

Organization: TU Eindhoven

TL;DR: Built a static analysis pipeline using Maracas, GumTree, e-knife, and SDG to detect Java breaking changes and patches, improving accuracy and efficiency in software evolution.

Details

During software evolution, the changing relationship between dependencies and their dependents often leads to breaking changes that disrupt software functionality. This report explores a pipeline that detects and analyzes these breaking changes by combining static analysis, code differencing, and code slicing tools. I integrate Maracas, GumTree, e-knife, and SDG to generate a comprehensive database of broken-use slices and their corresponding patches. The methodology offers a structured approach to identifying and understanding the impact of breaking changes on client code, and it contributes to the efficiency and accuracy of patch generation in Java software ecosystems.
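
The core idea of a broken use can be shown with a deliberately simplified sketch. It is written in Python for brevity and does not call Maracas, GumTree, or e-knife: it only compares two hypothetical sets of API signatures and flags client call sites that rely on a signature missing from the new version. All signatures and locations are made up.

```python
from typing import Dict, List, Set

# Hypothetical public API of a library before and after an upgrade.
old_api: Set[str] = {"Parser.parse(String)", "Parser.close()", "Lexer.next()"}
new_api: Set[str] = {"Parser.parse(String, Charset)", "Parser.close()", "Lexer.next()"}

# Hypothetical client call sites, keyed by source location.
client_uses: Dict[str, List[str]] = {
    "App.java:12": ["Parser.parse(String)"],
    "App.java:20": ["Parser.close()"],
}


def removed_symbols(old: Set[str], new: Set[str]) -> Set[str]:
    """Signatures present in the old API but absent from the new one."""
    return old - new


def broken_uses(uses: Dict[str, List[str]], removed: Set[str]) -> Dict[str, List[str]]:
    """Client locations that call at least one removed signature."""
    broken: Dict[str, List[str]] = {}
    for location, calls in uses.items():
        hits = [call for call in calls if call in removed]
        if hits:
            broken[location] = hits
    return broken


removed = removed_symbols(old_api, new_api)
broken = broken_uses(client_uses, removed)
```

A string comparison like this misses behavioral changes and type-level subtleties; this is why the actual pipeline combines static analysis, differencing, and slicing.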

Stack: Java, Maracas, GumTree, e-knife, SDG, Maven