scribble

Life of Xunzhang


Project

This is a list of software I have written or been heavily involved in since 2012. I am an open-source enthusiast: I like GitHub and do not like GitLab.

  

Workload Zoo

  • Zoo of real Database workloads
  • Disclose Later

      

ResTune

  • ResTune: Resource Oriented Tuning Boosted by Meta-Learning for Cloud Databases, SIGMOD 2021 (to appear).
  • Minimize the resource cost while guaranteeing the SLA requirement by tuning the database knobs
  • Leverage the tuning experience across workloads and hardware from the cloud provider’s perspective

      

Leaper

  • Leaper: A Learned Prefetcher for Cache Invalidation in LSM-tree based Storage Engines (paper), VLDB 2020.
  • Addresses the unstable performance of LSM-tree based storage engines
  • Designs a learned prefetcher, plugging ML into OLTP systems

      

Symphony

  • Simplified and unified AI pipeline system
  • An end-to-end, composable AI software platform (model building, data transformation, training, serving and more)
  • Participated in the architecture designs and was highly involved in early development

      

DyNet

  • The Dynamic Neural Network Toolkit
  • Multi-device support, self-described I/O format for native save/load, ParameterCollection interface, etc.
  • 100+ commits
  • Top #5 contributor

      

Poseidon-Tensorflow

  • Distributed TensorFlow implementation built on the Poseidon communication library
  • Overlaps synchronization (communication) with computation during mini-batch SGD
  • Make your native TensorFlow model training script distributed with zero code modification!
  • Linear speedup up to 64 GPUs

      

Apache HAWQ

  • Native SQL Engine on Hadoop, Collaborative Project at Pivotal
  • Voted in as an ASF committer: 90+ commits, 50+ JIRA issues, and 50+ threads on the dev/user mailing lists in 7 months
  • Top #3 committer in the open source community
  • 2016

      

Rec

  • Recommendation cloud service based on SQL: combines infrastructure inside Pivotal, including HAWQ, MADlib, and Cloud Foundry
  • Winning project of the Hackday competition ✌️
  • With @lma, May 2016

      

Paracel Toolkits

  • Distributed Algorithm Library built on Paracel framework
  • Algorithms include regression (ridge, lasso), classification (lr), clustering (kmeans, spectral clustering), graph processing (pagerank), recommendation systems (svd, mf, similarity, decision tree, als) and topic modeling (lda)
  • @Douban.Inc, 2014

      

Plato

  • Realtime Recommendation System based on Factor Model
  • Three-layer backend: online/nearline/offline (balltree/regression/matrix factorization)
  • Platoon, an application built on Plato, is used for Douban FM and improved the completion rate by 5%
  • @Douban.Inc, 2015.

      

Paracel

afc

Threp