Description
Scope
- Remove GPT Tokenizer from SDK
- Remove tokenizers from notebook
This code works in a typical C# project but fails in a notebook, because the GPT resource files are at a different path from the one the tokenizer reads.
Console app:
using Microsoft.SemanticKernel.Connectors.AI.OpenAI.Tokenizers;
Console.WriteLine(GPT3Tokenizer.Encode("hello world").Count + " tokens");
Output:
2 tokens
C# notebook:
#r "nuget: Microsoft.SemanticKernel.Connectors.AI.OpenAI, 0.19.230804.2-preview"
using Microsoft.SemanticKernel.Connectors.AI.OpenAI.Tokenizers;
Console.WriteLine(GPT3Tokenizer.Encode("hello world").Count + " tokens");
Output:
Error: System.IO.FileNotFoundException: vocab.bpe not found, path: '~/.nuget/packages/microsoft.semantickernel.connectors.ai.openai/0.19.230804.2-preview/lib/netstandard2.0/Tokenizers/Settings/vocab.bpe'
   at Microsoft.SemanticKernel.Connectors.AI.OpenAI.Tokenizers.Settings.EmbeddedResource.ReadFile(String fileName)
   at Microsoft.SemanticKernel.Connectors.AI.OpenAI.Tokenizers.Settings.EmbeddedResource.ReadBytePairEncodingTable()
   at Microsoft.SemanticKernel.Connectors.AI.OpenAI.Tokenizers.Settings.GPT3Settings.<>c.<.cctor>b__6_1()
   at System.Lazy`1.ViaFactory(LazyThreadSafetyMode mode)
   at System.Lazy`1.ExecutionAndPublication(LazyHelper executionAndPublication, Boolean useDefaultConstructor)
   at System.Lazy`1.CreateValue()
   at Microsoft.SemanticKernel.Connectors.AI.OpenAI.Tokenizers.Settings.GPT3Settings.get_BpeRanks()
   at Microsoft.SemanticKernel.Connectors.AI.OpenAI.Tokenizers.GPT3Tokenizer.BytePairEncoding(String token)
   at Microsoft.SemanticKernel.Connectors.AI.OpenAI.Tokenizers.GPT3Tokenizer.Encode(String text)
   at Submission#2.<<Initialize>>d__0.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.CodeAnalysis.Scripting.ScriptExecutionState.RunSubmissionsAsync[TResult](ImmutableArray`1 precedingExecutors, Func`2 currentExecutor, StrongBox`1 exceptionHolderOpt, Func`2 catchExceptionOpt, CancellationToken cancellationToken)
Note that the code looks for the file here:
~/.nuget/packages/microsoft.semantickernel.connectors.ai.openai/0.19.230804.2-preview/
lib/netstandard2.0/Tokenizers/Settings/vocab.bpe
but the file is actually here:
~/.nuget/packages/microsoft.semantickernel.connectors.ai.openai/0.19.230804.2-preview/
contentFiles/any/netstandard2.0/Tokenizers/Settings/vocab.bpe
or here:
~/.nuget/packages/microsoft.semantickernel.connectors.ai.openai/0.19.230804.2-preview/
content/Tokenizers/Settings/vocab.bpe
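To confirm which of these locations exist on a given machine, a small check like the following can be run in a console app or a notebook cell. This is only a diagnostic sketch, not part of the SDK: it assumes the default NuGet global packages folder under the user profile (it can be relocated, for example through the NUGET_PACKAGES environment variable), and the variable names are made up for illustration.

```csharp
using System;
using System.IO;

// Assumption: NuGet uses its default global packages folder, ~/.nuget/packages.
// "~" is expanded by hand because .NET does not expand it in paths.
string home = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
string packageRoot = Path.Combine(home, ".nuget", "packages",
    "microsoft.semantickernel.connectors.ai.openai", "0.19.230804.2-preview");

// The three locations mentioned in this issue, relative to the package folder.
string[] candidates =
{
    Path.Combine("lib", "netstandard2.0", "Tokenizers", "Settings", "vocab.bpe"),
    Path.Combine("contentFiles", "any", "netstandard2.0", "Tokenizers", "Settings", "vocab.bpe"),
    Path.Combine("content", "Tokenizers", "Settings", "vocab.bpe"),
};

// Report, for each candidate, whether the file is present on disk.
foreach (string relative in candidates)
{
    string full = Path.Combine(packageRoot, relative);
    Console.WriteLine($"{(File.Exists(full) ? "found  " : "missing")} {full}");
}
```

If the report above holds, the first path should show as missing and at least one of the other two as found once the package has been restored.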