In the supervised learning phase, the trainers played both sides: the user and the AI assistant. In the reinforcement learning phase, human trainers first rated responses that the model had generated.