Hmmm, is there any way for a model to predict / edit its own tokens as it learns?
Kind of a recursive, self-evolving tokenization process?
I’m pretty new to deep learning, so this analogy might be off. But I’m reminded of convolutional layers inside image-based models. Could we have “tokenization layers” in a GPT model?
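Something like this toy PyTorch sketch is roughly what I have in mind (this is just my own assumption of what a “tokenization layer” could be: a strided convolution over raw byte embeddings standing in for the tokenizer, so the grouping of bytes into “tokens” is learned end to end rather than fixed beforehand):

```python
import torch
import torch.nn as nn

class LearnedTokenizationLayer(nn.Module):
    """Hypothetical 'tokenization layer': pools groups of bytes into learned token vectors."""
    def __init__(self, d_model=256, patch_size=4):
        super().__init__()
        self.byte_embed = nn.Embedding(256, d_model)   # one embedding per possible byte value
        # strided conv merges every `patch_size` bytes into one learned "token"
        self.pool = nn.Conv1d(d_model, d_model, kernel_size=patch_size, stride=patch_size)

    def forward(self, byte_ids):                        # (batch, seq_len) of values 0..255
        x = self.byte_embed(byte_ids)                   # (batch, seq_len, d_model)
        x = self.pool(x.transpose(1, 2)).transpose(1, 2)  # (batch, seq_len/patch_size, d_model)
        return x                                        # these would feed into the transformer blocks

# usage: 12 bytes of text become 3 learned "tokens"
layer = LearnedTokenizationLayer()
byte_ids = torch.tensor([list(b"hello world!")])
print(layer(byte_ids).shape)                            # torch.Size([1, 3, 256])
```

That’s just the simplest version I could think of; it’s in the spirit of the convolution analogy, not anything an actual GPT does.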
---
Edit: I asked GPT-4 about my idea.
It said the comparison between tokenization and convolutional feature detection is a “somewhat accurate” analogy. And it agreed that making the encoder a fixed, separate process that doesn’t evolve during training does limit the GPT model in certain ways and introduces quirks.
But it said that this might increase the computational requirements significantly, that the transformer architecture doesn’t lend itself to having “tokenization layers”, and that it isn’t clear how one could do that.
It did say that there may be ways to work around the main challenges, and that there is some research in this direction.