How I Created a 'Joe Rogan AI Experience' Style Podcast Interview
I took a famous novelist and had them interview their most iconic character. Here's how I did it.
Podcasting has become a major industry. Everyone (including me) seems to be doing it. Listening to podcasts is one of the best ways to stay on top of current thinking in your area, and most are free.
I was intrigued the other day when I came across the Joe Rogan AI Experience: an AI Joe Rogan interviewing AI celebrities such as Andrew Tate and Donald Trump. I was struck by how realistic they sound, how lifelike their voices: if you didn’t know, you would think it was an actual interview.
As is often the case whenever I see some new application of AI, I started to think about how I might hack it, take it in a different direction. It reminded me of a recent interview between Jordan Peterson and Brian Roemmele, where they were exploring how large language models will increasingly enable us to have access to the sum total of all the world’s thinking. And whilst ChatGPT does have its limitations, there will soon come a time when we genuinely do have the entire, synthesised thinking of Nietzsche or the Buddha as AI models, and we can push these models against one another to see what happens. It feels like having a time machine, this ability to have minds that span centuries in the same ‘room’ together. The potential to unlock new ideas is vast.
We can already do this, but it is a little limited. However, it does make for a fun project, so I set myself the task to create a podcast where I had famous authors interviewing their most iconic characters. Over one weekend I had Mark Twain interviewing Huck Finn, and F. Scott Fitzgerald interviewing Jay Gatsby. You can find the podcast episodes here.
Are they complex, nuanced and in-depth interviews? Not so much. ChatGPT tends towards the superficial and has to be constantly reminded not to lapse into cliche. Even the newest version, ChatGPT-4, is limited in this regard. But it was fun to do, and I do think it gave a few interesting insights into both author and character.
Let me take you step by step through how I did it.
You will first need to create your podcast script. I put this prompt into ChatGPT-4: “You are creating a brand new type of podcast, where you create interviews between writers and their most famous characters. Who should the interview be between first?” ChatGPT liked this idea and immediately decided on Rowling and Potter. However, I told it I wanted to stick to historical authors, so it then chose Mark Twain and Huck Finn. Not a bad choice. ChatGPT then outlined the themes the podcast could touch on, such as the nature of freedom, society of the time, and how Huck felt about his experiences.
I asked it to go step by step and plan out the podcast. It took me through the various stages: research, scripting, voice acting and audio production, branding, hosting, and promotion. I could probably have done without this step as most weren’t relevant.
I then asked it to become the researcher and learn as much about the author and character as it could by using this prompt: “I want you to be my researcher. You will learn as much as you can about the novel and the author so that you are ready to write the podcast script.” This approach is a useful one: I have found the best way to get the optimal output is to almost play games with ChatGPT, asking it to pretend to be someone or act in a certain way. ChatGPT then went into quite a lot of depth on both Huck and Twain.
From here, it’s time to plan out the podcast order. I prompted “Let’s now break the script down into sections so that we can draft it step by step”. This is an important step to take in the process. You cannot expect any degree of depth if you don’t multi-step prompt longer pieces of writing. If you ask it to write the script out in one prompt it will do a fairly superficial job. Breaking it into sections, and drafting one section at a time, is the only way you can end with a reasonable amount of depth. ChatGPT broke the interview into seven sections: intro, setting the scene, beginning, middle, climax, conclusion and closing remarks. That’s actually quite helpful for my own podcasts.
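As a rough illustration, the section-by-section approach could be scripted rather than typed by hand. This is only a minimal Python sketch: the function name `build_section_prompts` and the wording of each prompt are assumptions made for the example, and the only thing taken from the process above is the list of seven sections.

```python
# The seven sections ChatGPT proposed for the interview.
SECTIONS = [
    "intro",
    "setting the scene",
    "beginning",
    "middle",
    "climax",
    "conclusion",
    "closing remarks",
]


def build_section_prompts(host, guest, sections=SECTIONS):
    """Return one drafting prompt per section, in order.

    The prompt wording is hypothetical; adjust it to taste.
    """
    prompts = []
    for number, section in enumerate(sections, start=1):
        prompts.append(
            f"We are drafting a podcast script in which {host} interviews {guest}. "
            f"{host} is the host. "
            f"Draft only section {number} of {len(sections)}: {section}. "
            "Avoid cliche and stay in each speaker's own voice."
        )
    return prompts


if __name__ == "__main__":
    for prompt in build_section_prompts("Mark Twain", "Huck Finn"):
        print(prompt)
        print()
```

Each prompt would be sent as its own turn in the same conversation, so every section gets a full response instead of sharing one.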
We then went about drafting each section. I had to remind ChatGPT that Mark Twain was the podcast host and ask it to rewrite sections that I felt were cliche or not digging deep enough. With the Gatsby interview at times I would get a little frustrated and prompt things like “Ok straight away that does not sound like Gatsby! It sounds like you! Reread the novel and pick out the sorts of things he says so that straight away we feel like we are listening to Gatsby”. Whenever you tell ChatGPT off it is always formally contrite: “I apologise for the oversight” it usually begins.
I then copy pasted each section into a Google Doc and removed my or ChatGPT’s comments on the process.
Once the first draft was completed I wanted to redraft. I did this using the AskYourPDF plugin with ChatGPT-4, but I am sure you can simply copy and paste the completed script into ChatGPT if you’re using the free 3.5 version. The reason I used the plugin is that I think it holds the entire document in its memory separate from the prompt string. Remember that you only have around a 3000 word memory with GPT-3.5 and about double that with ChatGPT-4. This means that after 3000 or 6000 words it forgets what it was talking about and has to be reminded. If your interview is more than 3000 words it will forget the beginning as soon as it starts to respond, and will likely start to hallucinate. If you are using GPT-3.5 I would therefore suggest breaking the script into sections, feeding one section at a time, and asking for improvements.
To use the AskYourPDF plugin, I downloaded the txt file of the interview and uploaded it to the AskYourPDF companion site. It gives you a document ID which you then paste into ChatGPT. I then started a new ChatGPT window with the AskYourPDF plugin enabled and wrote this prompt: “We have been working together on a podcast between Mark Twain and Huckleberry Finn. Can you read what we have written and make suggestions for how this could be improved? <pasted document ID>” Even though it wrote the first podcast script entirely itself it still gave some excellent suggestions for how to improve, such as making the personality traits of the author and character more distinct, adding anecdotes, and adding further historical context.
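If you go the route of feeding the script in pieces, the splitting can be automated. The sketch below is one possible way to do it, assuming the script is plain text with blank lines between paragraphs; the function name `split_script` is hypothetical. It counts words to match the rough 3000-word figure above, although the models actually measure their limit in tokens, so treat the number as a loose guide.

```python
def split_script(script, max_words=3000):
    """Group paragraphs into chunks of at most max_words words each.

    Paragraphs are never split, so a single paragraph longer than
    max_words ends up whole in a chunk of its own.
    """
    chunks = []
    current = []
    current_words = 0
    for paragraph in script.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        words = len(paragraph.split())
        # Start a new chunk if this paragraph would push us over the limit.
        if current and current_words + words > max_words:
            chunks.append("\n\n".join(current))
            current = []
            current_words = 0
        current.append(paragraph)
        current_words += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Splitting on paragraph boundaries keeps each speaker's turn intact, which matters more here than filling every chunk to the limit.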
I then asked it to rewrite the entire script incorporating these suggestions. It did a pretty good job, removing some of the repetition and deepening certain areas.
It was now time to create the voices. I headed over to Eleven Labs, which currently has the most authentic sounding voice cloning that I can find. You can use the free voice models (which really aren’t bad), but if you want to create your own voices (which I did) then you’ll need to pay. That said, it is very reasonable: for $5 a month you get 30,000 credits, which will be enough for around 25 minutes of voice. If I want to take this further (which knowing me is likely) then I will upgrade to the $22 Creator account which enables 2 hours per month. For $99 a month you get ten hours. Whilst that seems a lot I am thinking about how much could be done with all these natural sounding voices.
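To get a feel for how far a plan will stretch, the figures above can be turned into a quick estimate. This is a back-of-the-envelope sketch only: 30,000 credits for roughly 25 minutes works out at about 1,200 credits per minute, and the code assumes one credit per character of script, which is my assumption rather than a documented rate. The name `estimate_usage` is made up for the example.

```python
CREDITS_PER_MONTH = 30_000  # $5 plan, figure quoted above
MINUTES_PER_MONTH = 25      # rough figure quoted above
CREDITS_PER_MINUTE = CREDITS_PER_MONTH / MINUTES_PER_MONTH  # 1,200 per minute


def estimate_usage(script, credits_per_char=1):
    """Return (credits, minutes) for a script.

    credits_per_char=1 is an assumption, not a documented rate.
    """
    credits = len(script) * credits_per_char
    minutes = credits / CREDITS_PER_MINUTE
    return credits, minutes


if __name__ == "__main__":
    sample = "x" * 12_000  # stand-in for a 12,000-character script
    credits, minutes = estimate_usage(sample)
    print(f"{credits} credits, about {minutes:.0f} minutes of audio")
```

Remember that every regeneration of a glitchy section spends credits again, so the real budget needs some headroom.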
To use the pre-made models, choose from the dropdown menu at the top of the screen. They are all quite realistic. For cloned models you need to go into VoiceLab and choose Add Generative or Cloned Voice. You then have three options: design a voice from scratch, clone from a voice sample, or register a pro version of your voice (coming out in July and only for the Creator subscription).
I needed MP3 samples of the voices I wanted to clone. You can of course record your own voice and upload, but I wanted the voices to be as authentic as possible: for the Gatsby interview I wanted Fitzgerald’s actual voice and to use DiCaprio for Gatsby. I found a YouTube video of Fitzgerald reading and a DiCaprio interview, downloaded them using the SaveFrom extension on the Edge browser, dropped them into DaVinci Resolve (the free alternative to Adobe Premiere), edited them so I only had a sample of their voice (with no background or other people speaking) then exported audio only. Two minutes seems to be enough. It’s important to have the voice audio as clean as possible - any artefacts will render the clone quite glitchy. There is of course a question here over the use of DiCaprio’s voice: whilst it is not illegal, as there is no current copyright law applicable to voice, there are potential ethical questions here. This isn’t one I’m going to get into in this article as I really only want to show you the potential of these new technologies. But we already know that there are bad actors using voice in malicious ways.
Once I had my cloned voices created, it was a case of copying the script section by section into Eleven Labs, applying the appropriate voice clone, and downloading it as an MP3. Before I added it to a DaVinci Resolve timeline, I ran the cloned voice through Adobe’s Podcast voice enhancer, a miraculous bit of AI wizardry that can take even the thinnest and most badly recorded voice and make it sound like it’s in a podcast studio. As Fitzgerald’s cloned voice was quite weak (owing to the YouTube recording being very old), running through this AI tool boosted it significantly whilst maintaining the original quality. There is the option of course of running the original recording through Adobe before cloning, but the more you have AI interfere in every step of the process, the more likely it’ll sound robotic. I’ll try it next time and assess the result.
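Copying the script across by hand gets tedious, so it can help to split it into speaker turns first and note which voice and filename each one needs. The sketch below assumes the script uses lines of the form "Speaker: text", which may not match your own layout; the functions `parse_turns` and `plan_clips` and the voice names are all hypothetical. It does not call Eleven Labs at all, it only prepares the list of clips to generate.

```python
def parse_turns(script, speakers):
    """Return a list of (speaker, text) tuples from 'Speaker: text' lines.

    Only labels found in `speakers` start a new turn, so a colon inside
    someone's dialogue is not mistaken for a new speaker.
    """
    turns = []
    for line in script.splitlines():
        line = line.strip()
        if not line:
            continue
        label, sep, rest = line.partition(":")
        if sep and label.strip() in speakers:
            turns.append((label.strip(), rest.strip()))
        elif turns:
            # An unlabelled line continues the previous speaker's turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line}".strip())
    return turns


def plan_clips(turns, voices):
    """Pair each turn with a voice name and a numbered output filename."""
    plan = []
    for index, (speaker, text) in enumerate(turns, start=1):
        plan.append({
            "file": f"{index:03d}_{speaker.lower().replace(' ', '_')}.mp3",
            "voice": voices[speaker],
            "text": text,
        })
    return plan


if __name__ == "__main__":
    # Hypothetical voice names; use whatever you called your clones.
    voices = {"Mark Twain": "twain_clone", "Huck Finn": "huck_clone"}
    script = "Mark Twain: Welcome, Huck.\nHuck Finn: Thank you, Mr Twain."
    for clip in plan_clips(parse_turns(script, voices), voices):
        print(clip["file"], clip["voice"], clip["text"])
```

The numbered filenames keep the clips in order when they are dropped onto the timeline later.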
Each section of the interview was then added to a DaVinci timeline. I added a few sound effects from Pixabay’s excellent free sound effect library just to add some depth, exported the final audio, and added it to my podcast host (I use Spotify for Podcasters, which is super simple and free). I created the artwork using Adobe Express (which I prefer to Canva). ChatGPT wrote the blurb for each episode and even came up with the title.
And that’s it! Once I had the workflow up and running I could work quite fast, and from initial idea through to publishing, the Gatsby interview took about 4 hours.
It’s not perfect. The voices still sound a little laboured and robotic at times and the script occasionally lapses into cliche and isn’t hugely deep. But as an exercise in what is already possible with AI I think you can see that there is enormous potential.
You can try this with your students. Any two characters, historical figures or even current celebrities can be thrown together to see what happens. Remember, because ChatGPT is based on a vast amount of internet data, there has to be a decent corpus of info and writing by and about the participants, or ChatGPT will make stuff up. It still does from time to time but the more information it has to draw on, the better. For example, asking it to take the role of Shakespeare might be a challenge as there is so little known about the man. I’ll try it and see.
What’s next? I have a few more author and character interviews to do, but am also interested in taking two diametrically opposed figures from history and throwing them together. Or perhaps two characters from different books? Mrs Dalloway and Jane Eyre, anyone?