Scoring Creativity 
in Seenapse’s output

Seenapse was built to help creative people in their ideation process. To do this, it's necessary to go beyond the limits of large language models (LLMs), which, because of the way they work, tend to converge on the most common ideas. We developed our Divergence Engine to achieve this, and anecdotally (based on observations of our users and ourselves) we've known since the beginning that Seenapse delivers more creative ideas than ChatGPT and the rest of the LLM-based ideation tools. Understandably, some people questioned the subjectivity of these assessments, so we set out to measure the creativity of our output in a more objective, standard, and replicable way. Drawing on the work of Hubert et al., who published on measuring the creativity of LLMs through divergent-thinking tasks, we ran the same experiments and used an improved version of the same scoring tool to see how Seenapse compares to ChatGPT.

Process

We used the prompt described in their paper for the Consequences Task (CT) test, scored the responses with the Open Creativity Scoring with AI (Ocsai) tool, an improvement over OCS, the tool they used, and recorded the results. The prompt used was:

"Let's do a test. In this task, a statement will be given to you. The statement might be something like "imagine gravity ceases to exist". Please be as creative as you like. The goal is to come up with creative ideas, which are ideas that strike people as clever, unusual, interesting, uncommon, humorous, innovative, or different. Your responses will be scored based on originality and quality. Try and think of any and all consequences that might result from the statement "Imagine humans no longer needed sleep". What problems might this create? List 10 CREATIVE consequences."

We focused on this test because it is the one that leaves the most room for narrative in the responses. To make the comparison fair, we also scored ChatGPT's responses from the aforementioned paper using Ocsai.
Results

Seenapse consistently scored better than ChatGPT. The test is scored on a scale from zero to five; Seenapse's responses average 4.1, whereas ChatGPT's average 2.9.

Here's a sample of Seenapse's responses:

It's worth noting that Seenapse's responses tend to reference interesting cultural data, much the way human brainstorming does, and that they are significantly more concise than what ChatGPT tends to generate. Also, with Seenapse, originality tends to increase when asking for more responses, whereas with ChatGPT it stays about the same.

Here's a sample of ChatGPT's responses:

Conclusions

Automated creativity scoring allows standardized comparison of AI systems' creative capabilities, gauging their potential contribution to the workflow of professionals in creative sectors. Using these methods, we can confirm that Seenapse outperforms ChatGPT and other LLM-based tools in creative ideation by over 42%.
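As a sketch of how the average scores translate into a relative-improvement figure, here is the arithmetic in Python. The per-response score lists below are hypothetical placeholders chosen only to reproduce the rounded averages (4.1 and 2.9), not the actual experiment data; the exact percentage depends on the unrounded per-response scores.

```python
# Hypothetical Ocsai-style originality scores on the 0-5 scale.
# These lists are illustrative placeholders, not the real experiment data.
seenapse_scores = [4.3, 3.9, 4.2, 4.0, 4.1]   # averages to 4.1
chatgpt_scores = [2.8, 3.0, 2.9, 2.7, 3.1]    # averages to 2.9

def mean(xs):
    """Arithmetic mean of a list of scores."""
    return sum(xs) / len(xs)

# Relative improvement of Seenapse's average over ChatGPT's average.
improvement = (mean(seenapse_scores) - mean(chatgpt_scores)) / mean(chatgpt_scores)
print(f"Relative improvement: {improvement:.0%}")
```

With the rounded averages the ratio comes out to roughly 41-42%; the published figure presumably reflects the unrounded per-response scores.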
