The website www.bluewin.ch asked ChatGPT, the new and already famous AI, who the best NHL players in history are. Here is ChatGPT's answer: "If I had to pick one player in hockey, it would be Wayne Gretzky. He was simply in a class of his own.
His vision was second to none and his ability to anticipate the movements of his teammates was incredible. He was also one of the most humble and generous players to ever take the ice. His records and stats speak for themselves, but it was his passion and love for the game that set him apart.
He really was The Greatest One." ChatGPT was launched on November 30, 2022 and attracted attention for its detailed and articulate responses, although its accuracy has been criticized. Both the supervised learning phase and the reinforcement learning phase relied on human trainers to improve the model's performance.
In the supervised phase, the model was trained on conversations in which the trainers played both sides: the user and the AI assistant. In the reinforcement phase, the trainers first ranked responses that the model had generated in an earlier conversation.
These evaluations were used to create reward models, against which the model was further fine-tuned over several iterations of Proximal Policy Optimization (PPO). PPO algorithms are a cost-effective alternative to Trust Region Policy Optimization (TRPO) algorithms: they avoid many of TRPO's computationally expensive operations and run faster.
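For readers curious about the mechanics, the two ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OpenAI's actual training code: the function names are hypothetical, the pairwise loss is one common way to turn human preferences into a reward-model training signal, and the clipping value of 0.2 is a frequently used default rather than a figure from the source.

```python
import math


def pairwise_ranking_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Loss for one human comparison (hypothetical helper).

    Computes -log(sigmoid(reward_preferred - reward_rejected)), which is
    small when the reward model scores the preferred response higher.
    """
    diff = reward_preferred - reward_rejected
    # Numerically stable softplus(-diff) = log(1 + exp(-diff)).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def ppo_clipped_objective(ratio: float, advantage: float, epsilon: float = 0.2) -> float:
    """PPO clipped surrogate objective for a single sample (to be maximized).

    ratio     -- new policy probability / old policy probability of the action
    advantage -- how much better the action was than expected
    epsilon   -- clipping range (0.2 is an assumed, commonly used default)
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Taking the minimum removes any incentive to move the policy far
    # away from the old one in a single update.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # The reward model is penalized more when it prefers the rejected response.
    print(pairwise_ranking_loss(2.0, 0.0), pairwise_ranking_loss(0.0, 2.0))
    # A large policy change with positive advantage is capped by the clip.
    print(ppo_clipped_objective(1.5, 1.0))
```

The clipping step is where PPO's lower cost comes from: instead of solving TRPO's constrained optimization problem, it limits the size of each policy update with a simple `min` and `max`.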
The models were trained in collaboration with Microsoft on its Azure cloud infrastructure. ChatGPT is a large language model that was pre-trained with unsupervised machine learning and then optimized with supervised and reinforcement learning techniques; it was developed to serve as a basis for building other machine learning models.
ChatGPT builds on OpenAI's InstructGPT models, which are an evolution of the GPT-3 models. InstructGPT models are models whose pre-training has been manually refined by human trainers. Specifically, ChatGPT was developed from a GPT-3.5 model, using supervised learning and reinforcement learning as optimization techniques.