TensorFlow Ranking

TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform. It contains the following components:

- Commonly used loss functions including pointwise, pairwise, and listwise losses.
- Commonly used ranking metrics like Mean Reciprocal Rank (MRR) and Normalized Discounted Cumulative Gain (NDCG).
- Multi-item (also known as groupwise) scoring functions.
- LambdaLoss implementation for direct ranking metric optimization.
- Unbiased Learning-to-Rank from biased feedback data.
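
To make the two metrics named above concrete, here is a minimal pure-Python sketch of reciprocal rank and NDCG for a single query. It is an illustration only and not the library's implementation: the function names are hypothetical, and the gain `2^label - 1` with a `log2` discount is one common NDCG convention assumed here.

```python
import math


def rank_by_score(labels, scores):
    """Return the relevance labels reordered by descending model score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def reciprocal_rank(ranked_labels):
    """1 / position of the first relevant item (label > 0), or 0.0 if none."""
    for position, label in enumerate(ranked_labels, start=1):
        if label > 0:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(ranked_label_lists):
    """MRR: the average reciprocal rank over a collection of queries."""
    return sum(reciprocal_rank(r) for r in ranked_label_lists) / len(ranked_label_lists)


def dcg(ranked_labels, topn=None):
    """Discounted cumulative gain (assumed gain 2^label - 1, log2 discount)."""
    labels = ranked_labels if topn is None else ranked_labels[:topn]
    return sum((2.0 ** label - 1.0) / math.log2(position + 1)
               for position, label in enumerate(labels, start=1))


def ndcg(ranked_labels, topn=None):
    """DCG normalized by the DCG of the ideal (label-sorted) ordering."""
    ideal = dcg(sorted(ranked_labels, reverse=True), topn)
    return dcg(ranked_labels, topn) / ideal if ideal > 0 else 0.0


# One query with three documents: graded labels and hypothetical model scores.
labels = [0.0, 2.0, 1.0]
scores = [0.9, 0.8, 0.1]
ranked = rank_by_score(labels, scores)
print(reciprocal_rank(ranked), ndcg(ranked))
```

Here the irrelevant document is scored highest, so the first relevant item sits at position 2 and NDCG falls below 1.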

We envision that this library will provide a convenient open platform for hosting and advancing state-of-the-art ranking models based on deep learning techniques, and thus facilitate both academic research and industrial applications.

Linux Installation

To build TensorFlow Ranking locally, you will need to install:
- Bazel, an open source build tool.
```
$ sudo apt-get update && sudo apt-get install bazel
```
- Pip, a Python package manager.
```
$ sudo apt-get install python-pip
```
- VirtualEnv, a tool to create isolated Python environments.
```
$ pip install --user virtualenv
```

Clone the TensorFlow Ranking repository.

$ git clone https://github.com/tensorflow/ranking.git

Build the TensorFlow Ranking wheel file and store it in the /tmp/ranking_pip folder.

$ cd ranking  # The folder cloned in the previous step.
$ bazel build //tensorflow_ranking/tools/pip_package:build_pip_package
$ bazel-bin/tensorflow_ranking/tools/pip_package/build_pip_package /tmp/ranking_pip

Install the wheel package using pip. Test in virtualenv to avoid clashes with any system dependencies.

$ ~/.local/bin/virtualenv -p python3 /tmp/tfr
$ source /tmp/tfr/bin/activate
(tfr) $ pip install /tmp/ranking_pip/tensorflow_ranking*.whl

Run all TensorFlow Ranking tests.

(tfr) $ bazel test //tensorflow_ranking/...

Invoke the TensorFlow Ranking package in Python (within virtualenv).
```
(tfr) $ python -c "import tensorflow_ranking"
```

Example Code

The repository includes a script that runs over a dummy data set in the LIBSVM format.
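
For readers unfamiliar with the format, the sketch below parses one LIBSVM-style ranking line into a label, a query id, and a dense feature vector. It assumes the common layout `<label> qid:<id> <index>:<value> ...` with 1-based feature indices; the function name `parse_libsvm_line` is hypothetical and this is not the parser used by the example script.

```python
def parse_libsvm_line(line, num_features):
    """Parse '<label> qid:<id> <index>:<value> ...' into (label, qid, features).

    Features missing from the line are left at 0.0 in the dense vector.
    """
    line = line.split("#", 1)[0]  # drop an optional trailing comment
    tokens = line.split()
    label = float(tokens[0])
    qid = tokens[1].split(":", 1)[1]  # assumes the second token is 'qid:<id>'
    features = [0.0] * num_features
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index) - 1] = float(value)  # LIBSVM indices are 1-based
    return label, qid, features


# A made-up line with 4 feature slots, two of which are present.
print(parse_libsvm_line("2 qid:10 1:0.5 3:1.5 # doc-a", num_features=4))
```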

Running Script

Set up the data and directory.

OUTPUT_DIR=/tmp/output && \
TRAIN=tensorflow_ranking/examples/data/train.txt && \
VALI=tensorflow_ranking/examples/data/vali.txt && \
TEST=tensorflow_ranking/examples/data/test.txt

Build and run.

rm -rf $OUTPUT_DIR && \
bazel build -c opt \
tensorflow_ranking/examples/tf_ranking_libsvm_py_binary && \
./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary \
--train_path=$TRAIN \
--vali_path=$VALI \
--test_path=$TEST \
--output_dir=$OUTPUT_DIR \
--num_features=136 \
--num_train_steps=100
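
The `--num_features` flag above fixes the width of each document's feature vector. Listwise training also typically needs the documents grouped per query into lists of equal length. The following is a rough sketch of that grouping step under stated assumptions: the function name `group_and_pad`, the padding label of -1, and the truncation to `list_size` are illustrative choices, not taken from the example script.

```python
from collections import OrderedDict

PADDING_LABEL = -1.0  # assumed sentinel marking padded (non-existent) documents


def group_and_pad(examples, list_size, num_features):
    """Group (label, qid, features) tuples by qid and pad each list to list_size.

    Lists longer than list_size are truncated; shorter ones are padded with
    PADDING_LABEL and all-zero feature vectors.
    """
    by_query = OrderedDict()
    for label, qid, features in examples:
        by_query.setdefault(qid, []).append((label, features))

    lists = []
    for qid, docs in by_query.items():
        docs = docs[:list_size]
        pad = list_size - len(docs)
        labels = [label for label, _ in docs] + [PADDING_LABEL] * pad
        feats = [f for _, f in docs] + [[0.0] * num_features for _ in range(pad)]
        lists.append((qid, labels, feats))
    return lists


# Three made-up documents across two queries, two features each.
examples = [
    (1.0, "a", [0.1, 0.2]),
    (0.0, "b", [0.3, 0.4]),
    (2.0, "a", [0.5, 0.6]),
]
for qid, labels, feats in group_and_pad(examples, list_size=3, num_features=2):
    print(qid, labels, feats)
```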

TensorBoard

Training results such as loss and metrics can be visualized using TensorBoard.

(Optional) If you are working on a remote server, set up port forwarding with this command.
```
$ ssh <remote-server> -L 8888:127.0.0.1:8888
```

Install TensorBoard and invoke it with the following commands.

(tfr) $ pip install tensorboard
(tfr) $ tensorboard --logdir $OUTPUT_DIR

Jupyter Notebook

An example Jupyter notebook using the LIBSVM format is available in tensorflow_ranking/examples/tf_ranking_libsvm.ipynb.

To run this notebook, first follow the steps in Linux Installation to set up a virtualenv environment with the tensorflow_ranking package installed.
Install Jupyter within virtualenv.
```
(tfr) $ pip install jupyter
```

Start a Jupyter notebook instance on the remote server.

(tfr) $ jupyter notebook tensorflow_ranking/examples/tf_ranking_libsvm.ipynb \
        --NotebookApp.allow_origin='https://colab.research.google.com' \
        --port=8888

(Optional) If you are working on a remote server, set up port forwarding with this command.
```
$ ssh <remote-server> -L 8888:127.0.0.1:8888
```
Running the notebook.
- Open http://localhost:8888/ in a browser on your local machine and browse to the notebook.
- Alternatively, use a Colaboratory notebook via colab.research.google.com and open the notebook in the browser. Choose local runtime and connect to port 8888.

References

Rama Kumar Pasumarthi, Xuanhui Wang, Cheng Li, Sebastian Bruch, Michael Bendersky, Marc Najork, Jan Pfeifer, Nadav Golbandi, Rohan Anil, Stephan Wolf. TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank. CoRR abs/1812.00073 (2018)
Qingyao Ai, Xuanhui Wang, Nadav Golbandi, Michael Bendersky, Marc Najork. Learning Groupwise Scoring Functions Using Deep Neural Networks. CoRR abs/1811.04415 (2018)
Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. Learning to Rank with Selection Bias in Personal Search. SIGIR 2016.
Xuanhui Wang, Cheng Li, Nadav Golbandi, Mike Bendersky, Marc Najork. The LambdaLoss Framework for Ranking Metric Optimization. CIKM 2018.

Citation

If you use TensorFlow Ranking in your research and would like to cite it, we suggest you use the following citation:

   @misc{TensorflowRanking2018,
   author = {Rama Kumar Pasumarthi and Xuanhui Wang and Cheng Li and Sebastian Bruch and Michael Bendersky and Marc Najork and Jan Pfeifer and Nadav Golbandi and Rohan Anil and Stephan Wolf},
   title = {TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank},
   year = {2018},
   eprint = {arXiv:1812.00073},
   }