A tool that automatically generates input text. It acts like an intelligent typist, writing valid input text for different mobile app input scenarios.
We provide the source code (./source code/). Note that our LLM is accessed through the OpenAI API, so please replace the key in the code with your own OpenAI key. Thank you very much for supporting our work!
The use of the OpenAI API is described in detail in this Readme, as is the method for generating the training data.
The following table lists all of our patterns.
| Id | Sample of linguistic patterns/rules | Examples of linguistic patterns/rules |
| --- | --- | --- |
| Patterns related to input widget: IWPn | | |
| 1 | Please input < widget[n] >, the < widget[n] > is | Please input game name, the game name is |
| 2 | Please input < widget[det+n] >, < widget[det+n] > is | Please input your nickname, your nickname is |
| 3 | Please < widget[v+n] >, the < widget[n] > is | Please search the food, the food is |
| 4 | Please < widget[v] > | Please search |
| 5 | < widget[n] > + | Your weight is [MASK] kg |
| 6 | < widget[n] > + | Your age is [MASK] |
| 7 | < widget[prep] > + | From [MASK] |
| Patterns related to local context: LCPn | | |
| 8 | < widget[prep] > + | From [MASK], to [MASK] |
| 9 | This input is about < local[n] > | This input is about the NBA team. |
| 10 | This input is about < local[n] >, we need to < local[v+n] > | This input is about one-way flight, we need to search the flight information. |
| 11 | This input is about < local[n] >, please < local[v] > | This input is about your health, please input. |
| 12 | This input is about < local[n] >, we need to input < local[n] > | This input is about one-way train, we need to input the seat map. |
| 13 | This input is about < local[n] >, we need to know it < local[prep] > | This input is about your trip, we need to know it from. |
| Patterns related to global context: GCPn | | |
| 14 | This is < app name > app, in its < activity name > page, the input category is < input category >. | This is a NBA sport app, in its search the NBA team page, the input category is query category. |
| Prompt generation rules | | |
| 1 | < GCPtn > + < LCPtn > + < IWPtn > | This is a my movie app, in its search movie page, the input category is query category. This input is about your favorite movie in this year. Please search the movie, the movie is |
| 2 | < GCPtn > + [< LCPtn > + < IWPtn >]{n} | This is a money wallet app, in its personal income page, the input category is numeric category. This input is about your monthly income. Income is [MASK] dollar. This input is about your expenses. Expenses is [MASK] dollar. |
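
The two prompt generation rules in the table can be illustrated with a short Python sketch. It is not the code shipped in this repository; the function names (`global_context`, `local_context`, `widget_verb_noun`, `build_prompt`) are hypothetical and only patterns 14, 9, and 3 are instantiated, assuming the pattern slots have already been extracted from the app.

```python
from typing import Sequence, Tuple


def global_context(app_name: str, activity_name: str, input_category: str) -> str:
    """Pattern 14 (GCP): describe the app, the page, and the input category."""
    return (f"This is {app_name} app, in its {activity_name} page, "
            f"the input category is {input_category}.")


def local_context(topic: str) -> str:
    """Pattern 9 (LCP): describe what the input is about."""
    return f"This input is about {topic}."


def widget_verb_noun(verb_phrase: str, noun: str) -> str:
    """Pattern 3 (IWP): 'Please <v+n>, the <n> is'."""
    return f"Please {verb_phrase}, the {noun} is"


def build_prompt(gcp: str, pairs: Sequence[Tuple[str, str]]) -> str:
    """Rule 1 when len(pairs) == 1, rule 2 otherwise: GCP + [LCP + IWP]{n}."""
    parts = [gcp]
    for lcp, iwp in pairs:
        parts.append(lcp)
        parts.append(iwp)
    return " ".join(parts)


if __name__ == "__main__":
    # Rule 1: one global pattern, one local pattern, one widget pattern.
    gcp = global_context("a my movie", "search movie", "query category")
    lcp = local_context("your favorite movie in this year")
    iwp = widget_verb_noun("search the movie", "movie")
    print(build_prompt(gcp, [(lcp, iwp)]))
```

For rule 2, pass several (local, widget) pairs to `build_prompt`; widget patterns that contain `[MASK]` can be passed in as plain strings.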
The code and the fine-tuning data are available in the source code directory (./source code/).
To reproduce our model, fine-tune GPT-3 as follows; this should give the same effect.
1. We recommend using the OpenAI command-line interface (CLI). To install it, run
pip install --upgrade openai
(The following instructions work for version 0.9.4 and up. Additionally, the OpenAI CLI requires Python 3.)
2. Set your OPENAI_API_KEY environment variable by adding the following line to your shell initialization script (e.g. .bashrc, .zshrc) or by running it on the command line before the fine-tuning command:
export OPENAI_API_KEY="<OPENAI_API_KEY>"
3. Prepare training data
Training data is how you teach GPT-3 what you'd like it to say. Your data must be a JSONL document, where each line is a prompt-completion pair corresponding to one training example. You can use the CLI data preparation tool to convert your data into this file format.
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
- CLI data preparation tool: OpenAI provides a tool that validates your data, gives suggestions, and reformats it:
openai tools fine_tunes.prepare_data -f <LOCAL_FILE>
- Create a fine-tuned model
openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m curie
- After you've started a fine-tune job, it may take some time to complete. Your job may be queued behind other jobs on OpenAI's system, and training the model can take minutes or hours depending on the model and dataset size. If the event stream is interrupted for any reason, you can resume it by running:
openai api fine_tunes.follow -i <YOUR_FINE_TUNE_JOB_ID>
- Use a fine-tuned model
openai api completions.create -m <FINE_TUNED_MODEL> -p <YOUR_PROMPT>
With cURL:
curl https://api.openai.com/v1/completions -H "Authorization: Bearer $OPENAI_API_KEY" -H "Content-Type: application/json" -d '{"prompt": YOUR_PROMPT, "model": FINE_TUNED_MODEL}'
With Python:
import openai
openai.Completion.create(model=FINE_TUNED_MODEL, prompt=YOUR_PROMPT)
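
The JSONL training file described in step 3 can be written with a few lines of Python. The sketch below is only an illustration under stated assumptions: the helper names (`to_jsonl`, `write_jsonl`), the output file name, and the example prompt-completion pair are hypothetical and are not taken from our tuning data.

```python
import json
from typing import Iterable, Tuple


def to_jsonl(pairs: Iterable[Tuple[str, str]]) -> str:
    """Serialize (prompt, completion) pairs, one JSON object per line."""
    lines = []
    for prompt, completion in pairs:
        record = {"prompt": prompt, "completion": completion}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


def write_jsonl(pairs: Iterable[Tuple[str, str]], path: str) -> None:
    """Write the serialized pairs to a UTF-8 encoded JSONL file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_jsonl(pairs))


if __name__ == "__main__":
    # Hypothetical example pair; real pairs come from the data construction step.
    example_pairs = [
        ("This input is about your favorite movie in this year. "
         "Please search the movie, the movie is", " Titanic"),
    ]
    write_jsonl(example_pairs, "train.jsonl")  # hypothetical file name
```

The resulting file can then be passed to the data preparation tool to check it before fine-tuning.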
Because access to the GPT-3 API is tied to personal information, we will release our fine-tuned model after the double-blind review.
The keywords used by our approach are shown in the table.
The data construction algorithm code is as follows:
\begin{algorithm}
\caption{Heuristic-based training data construction}
\KwIn{\ldots}
\KwOut{$con$: constructed input content;}
Traverse \ldots\;
\If{$category ==$ `filled content'}{
  \If{`search, add, input, enter' not in \ldots}{\ldots}
}
\If{$category ==$ `search list'}{
  \If{`search, input' in \ldots}{\ldots}
}
\If{$category ==$ `popup menu'}{
  \If{$Tx_2 < Sx_1$}{
    obtain items of Spinner\;
  }
}
\If{$category ==$ `setting content'}{
  \If{`setting' in \ldots}{\ldots}
}
return \ldots\;
\end{algorithm}
We provide the pilot study dataset (./motivation/)
This is the pilot study dataset used in the motivation section.
It consists of screenshots with text inputs taken from Rico, 7000+ screenshots in total.
Because GitHub storage is limited to 2GB (we use 90% of it), we provide only the screenshots; the corresponding view hierarchy files can be downloaded from Rico.
After the double-blind review, we will also upload all of them to Google Drive.
We provide the experiment dataset (./experiment/)
This is the experimental dataset used for the effectiveness evaluation and the usefulness evaluation. The first part is the APKs from the effectiveness evaluation, covering 106 apps; the app information is shown in the table.
Because GitHub storage is limited to 2GB, we provide the first 85 APKs; the rest can be downloaded from Google Play using the information in the table.
After the double-blind review, we will upload all of them to Google Drive.