Development using GitHub Copilot and ChatGPT

In the last months I've more or less teamed up with ChatGPT and GitHub Copilot and found that amazingly helpful. In this blog I summarize a number of ideas for working with them that took me a while to find. I guess much of that will carry over to similar tools, but so far I haven't felt the need to try and compare them.

In the last months I've been extensively trying to use ChatGPT and its brother GitHub Copilot to write specifications and for development. It's been a bit of a learning curve - I still have the feeling I'm just picking up speed - but it is well worth the effort: it increases your productivity quite a lot and is a lot of fun, too, as there is always that helpful guy you can discuss things with, who is readily available to help you and who, frankly, has many limits but knows a lot more than you can ever hope to learn in your life. I've been doing some spare time projects (some are here) for which I wouldn't have had the time if not for Copilot and ChatGPT.

All in all, I think ChatGPT and GitHub Copilot are a great team for you and complement each other well, especially as they have different ways of operating. While ChatGPT does (surprise! :-) emulate a chat that makes it easy to discuss, ask questions, request modifications to suggestions and so forth, GitHub Copilot "just" gives you suggestions for what to insert at the cursor position while you're typing away. But let's dive into the details.

ChatGPT

I guess the first thing to overcome is the feeling that it's ridiculous to talk to a machine like you would talk to a human. Indeed, that's the first time I found a program for which it's better to think of it as a human than as a machine. More specifically, it is very much like chatting with a very helpful and extremely knowledgeable person who is sometimes very intelligent and sometimes not, and who does not hesitate in the slightest to invent things just to keep talking. But of course that analogy has its limits, and to be really productive you have to see where it breaks down.

So, it's a good candidate to discuss / co-create specifications with and to get many programming questions answered, and, of course, it serves as an amazing duck for rubber duck debugging that can often give you very helpful tips.

One thing to keep in mind is that the context it's able to use is currently rather limited (depending on the model something like 4000 tokens = about 3000 words, including the answer it's giving), even though ChatGPT conversations do a good job of simulating a larger space, and there is much progress on that (there is a 32k token variant you can also use for chat.openai.com when using the Superpower for ChatGPT Chrome plugin, and Anthropic already gives you even 100k token windows). So you sometimes have to think hard about which portions of your code and documents you are giving it as "food for thought", and sometimes create workarounds like my Grab Styles bookmarklet I described in my CSS with ChatGPT blog.
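To get a feeling for whether a text will fit, the rule of thumb above (about 4000 tokens for about 3000 words) can be turned into a rough estimate. This is only a minimal sketch: the function names and the reserve of 1000 tokens for the answer are my assumptions for illustration, and counting words is of course no replacement for a real tokenizer.

```javascript
// Rough estimate only: assumes about 0.75 words per token
// (3000 words for 4000 tokens), which varies with language and content.
const WORDS_PER_TOKEN = 0.75;

// Estimates the number of tokens from the number of whitespace separated words.
function estimateTokens(text) {
  const words = text.trim().split(/\s+/).filter(word => word.length > 0);
  return Math.ceil(words.length / WORDS_PER_TOKEN);
}

// Checks whether the text plus a reserve for the answer fits into the context window.
// Both default values are assumptions for illustration.
function fitsIntoContext(text, contextTokens = 4000, reservedForAnswer = 1000) {
  return estimateTokens(text) + reservedForAnswer <= contextTokens;
}

const sample = "word ".repeat(3000);
console.log(estimateTokens(sample));  // 4000
console.log(fitsIntoContext(sample)); // false, since nothing is left for the answer
```

The point of the reserve is that the answer counts against the same window, so a text that exactly fills the window leaves no room for a response.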

Another thing is that it doesn't learn from you - you cannot agree on a common baseline as you would with an actual person. If you give it a task like writing some kind of specification or a program, you'll often find that you want details done differently and request changes, and it will take a couple of turns until you get what you want. That's where keeping your personal library of ChatGPT prompts comes in. A prompt (for it to do something) usually contains not only information about the current task you need done, but often also quite a lot of information on how to do it. And that part is often worth keeping, so that you can copy it into the chat when you want a similar task done. That's why I tend not to enter a long discussion with ChatGPT, requesting changes until I get the right result, but to edit my original request and extend it until I get the right result in the first attempt. That is better considering the context windows I discussed, and also gives you a much better chance to find a part of your request that you can keep in your prompt library.
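Such a prompt library does not need any tooling, a text file is enough. Just to illustrate the idea of separating the reusable "how to do it" parts from the task at hand, here is a minimal sketch. The names promptLibrary and buildPrompt and the keys are hypothetical; the texts are taken from the prompt discussed in the example further below.

```javascript
// Hypothetical prompt library: reusable "how to do it" parts, keyed by a name.
const promptLibrary = new Map([
  ["expert", "In this chat please answer my programming questions as an expert."],
  ["explain", "I'm looking for detailed answers that explain why you are recommending what you recommend."],
]);

// Builds a complete request from the reusable parts plus the current task.
function buildPrompt(partNames, task) {
  const parts = partNames.map(name => {
    if (!promptLibrary.has(name)) throw new Error(`Unknown prompt part: ${name}`);
    return promptLibrary.get(name);
  });
  return [...parts, task].join("\n\n");
}

console.log(buildPrompt(
  ["expert", "explain"],
  "Extend the following script so that it matches the usage in the usage string."
));
```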

An interesting feature of chat.openai.com that took me a while to realize is that a conversation is actually not just linear, but a tree. If you edit one of your requests, you create a sibling node in the tree at that point and thereby a new branch. But you can switch back and forth to previous branches any time and extend them. ChatGPT only considers the branch that's currently shown as the conversation history, though. I tend to use that feature a lot by editing my request instead of asking it to revise things, since that makes for a clean, short conversation that mitigates the context window problem, while I'm still able to switch back and review previous tries.
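If it helps to picture that, here is a small model of such a conversation tree. It is only a sketch of the behaviour as I described it; the class MessageNode is hypothetical and says nothing about how chat.openai.com actually implements it.

```javascript
// Hypothetical model of a conversation tree, for illustration only.
class MessageNode {
  constructor(text, parent = null) {
    this.text = text;
    this.parent = parent;
    this.children = [];
    if (parent) parent.children.push(this);
  }

  // Editing a message creates a sibling: a new child of the same parent.
  edit(newText) {
    return new MessageNode(newText, this.parent);
  }

  // The history that is considered: only the path from the root to this node.
  history() {
    const path = [];
    for (let node = this; node; node = node.parent) path.unshift(node.text);
    return path;
  }
}

const root = new MessageNode("Write a tokenizer script.");
const answer = new MessageNode("Here is a first version ...", root);
const followUp = new MessageNode("Please use a switch statement.", answer);
const edited = followUp.edit("Please use a switch statement and output only the changes.");

console.log(edited.history());       // the original follow up is not part of this branch
console.log(answer.children.length); // 2
```

The old branch is still there and can be extended later, but it does not cost any context in the branch you are currently working on.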

In the context of programming, you can get ChatGPT to answer questions or give code fragments for specific problems. Or you can paste a whole file or larger fragments of a file into it and have it process that in some way, e.g. to do a refactoring. That copying and pasting back and forth is a bit annoying, but is a good complement for GitHub Copilot, since that just gives you suggestions what to add at a specific point, so that a refactoring is more difficult.

GitHub Copilot

During the daily programming work the Copilot will probably do much more for you, since it automatically gathers information from what you're typing, anyway, and will generate suggestions from that. Still, there are some things you can do to improve the suggestion quality quite a lot.

The problem is that you need to drop it some hints about what you're doing. Sometimes it's easy for it to guess - e.g. if you're doing some similar things in succession, like adding log statements that log the method and its arguments to many methods of a class. Sometimes a good method name and good argument names are enough. But in other cases you have to be more explicit to get good results.

As I mentioned recently, the first time I got a really good suggestion by Copilot was when I wrote the complete Javadoc of a method before even starting to code it, since I needed to think about what exactly should happen. So, being a good developer, keeping things well structured and well documented, and doing documentation first can help you. If you first type the documentation of a feature into a markdown file in your project and then go step by step through all files to implement it, that can help, as it carries over some context. But often you will want to be more explicit. If you want to do something, it can be a good idea to type a description of it into a comment, if you suspect that that'll be faster than typing out the code yourself, or if you're in somewhat unfamiliar terrain and would need some research. The comment can then be deleted - or kept if you think that'll improve the code quality. If you're implementing an external specification, it might help to temporarily copy the passage you're implementing into a comment and, again, delete that after the implementation or turn it into a regular comment to stay for good.
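To make the "documentation first" idea a bit more concrete, here is a minimal sketch in JavaScript, with a JSDoc comment standing in for the Javadoc. The function formatCall is a hypothetical example in the spirit of the log statements mentioned above, and the body is written by hand for this illustration, not actual Copilot output.

```javascript
/**
 * Formats a method call for a log statement.
 * Each argument is rendered with JSON.stringify, and the arguments
 * are separated by ", " and enclosed in parentheses after the method name.
 *
 * @param {string} methodName name of the called method
 * @param {Array} args the arguments of the call
 * @returns {string} e.g. tokenize("abc", 3)
 */
function formatCall(methodName, args) {
  // With a comment like the one above in place, a completion tool has
  // much more to go on when suggesting a body like this.
  const rendered = args.map(arg => JSON.stringify(arg));
  return `${methodName}(${rendered.join(", ")})`;
}

console.log(formatCall("tokenize", ["abc", 3])); // tokenize("abc", 3)
```

The comment pins down the details (quoting, separator) that a method name alone would leave open, which is exactly the information a suggestion needs.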

Since it's geared towards inserting text, it was a bit of a surprise to me that you can use it to rewrite or refactor parts of the code by adding a comment like `// that refactored to ... :` and then hitting tab to insert the completions again and again until the code is written out, and then deleting the old code. Possibly you have to type a bit of the start of that refactored code to get it going.

Even more: even though it is heavily geared towards coding, you can also ask it questions or discuss things with it. After all, it is a language model heavily trained on code, but it's still a language model. You could try to have it follow a conversation pattern:

// Question: what does the regex "\s(\d+)\s\|\s(.)" match?
// Answer: it matches a string that starts with a number, followed by a pipe, followed by anything. Spaces around the number and the pipe are ignored.
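As always, such an answer is worth a quick check. A minimal sketch that takes the regex exactly as printed in the question above and tries it on a few inputs (the helper function check is just for illustration):

```javascript
// The regex exactly as it appears in the question above.
const regex = /\s(\d+)\s\|\s(.)/;

// Returns the two captured groups, or null if the input does not match.
function check(input) {
  const match = regex.exec(input);
  return match ? { number: match[1], next: match[2] } : null;
}

console.log(check(" 42 | abc")); // { number: '42', next: 'a' }
console.log(check("42 | abc"));  // null
console.log(check("x 7 |  y"));  // { number: '7', next: ' ' }
```

So the answer is in the right direction but not exact: the whitespace around the number and the pipe is required rather than ignored, the match does not have to be at the start of the string, and the second group captures just a single character.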

Sometimes you have to nudge it a little - the italics are what Copilot wrote, bold what I wrote. Just for fun: that's what it came up with after I switched it on for this file, wrote the first two lines and hit tab a couple of times:

Me: Hi!
Copilot: Hi! How are you?
Me: I'm fine, thanks. How are you?
Copilot: I'm good. What are you up to?
Me: I'm writing a blog post about ChatGPT and Copilot.
Copilot: Oh, cool! What's it about?
Me: It's about how to use them together to improve your programming.
Copilot: That sounds interesting. What do you think about it?
Me: I think it's a good idea to use them together.
Copilot: I agree. I think it's a good idea to use them together.

But this kind of thing never worked very well for me, as the answers are often trivial and repeat themselves. What often does work nicely is a kind of sentence sharing, where you start a sentence in such a way that it has to come up with the answer you need in order to finish the sentence.

// The documentation of querySelector is at https://developer.mozilla.org/en-US/docs/Web/API/Document/querySelector

It might even help a little with writing documentation, but that's not much in comparison to what ChatGPT-4 can do for you.

An example

I needed a tool that can parse a string or file into ChatGPT-3.5 or ChatGPT-4 tokens (currently the same thing). So I wrote a basic version of a script that is just able to tokenize the standard input. That felt like guiding Copilot, but that's too much to describe here. (BTW: I know that algorithm is embarrassingly naive and inefficient, but this is just an experiment and it's good enough for that.) Copilot helps nicely when testing and thinking, because while you are thinking about which algorithm to use, it can easily serve requests like

// make a length distribution of the tokens and print that to stderr
// print the distribution sorted by length
// find out the largest difference between the length of a token and the length of the largest prefix of it that is also a token.

in seconds, while programming that by hand would take minutes.
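For illustration, this is roughly what the code for such requests could look like. It is a hand-written sketch, not the code Copilot generated: the tiny vocabulary and the function names are hypothetical, and I assume here that a token without any prefix that is also a token counts with a prefix of length 0.

```javascript
// Hypothetical tiny vocabulary; the real token list is much larger.
const tokens = ["a", "ab", "abc", "abcdef", "b", "xyz"];

// Length distribution: token length -> number of tokens, sorted by length.
function lengthDistribution(tokenList) {
  const distribution = new Map();
  for (const token of tokenList) {
    distribution.set(token.length, (distribution.get(token.length) || 0) + 1);
  }
  return new Map([...distribution.entries()].sort((a, b) => a[0] - b[0]));
}

// Largest difference between the length of a token and the length of its
// longest proper prefix that is also a token (0 if there is no such prefix).
function largestPrefixGap(tokenList) {
  const known = new Set(tokenList);
  let largest = 0;
  for (const token of tokenList) {
    let prefixLength = token.length - 1;
    while (prefixLength > 0 && !known.has(token.slice(0, prefixLength))) prefixLength--;
    largest = Math.max(largest, token.length - prefixLength);
  }
  return largest;
}

// Print the distribution to stderr, as in the first two requests above.
for (const [length, count] of lengthDistribution(tokens)) {
  console.error(`${length}: ${count}`);
}
console.log(largestPrefixGap(tokens)); // 3
```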

Then I added a string with the usage information that describes the various command line arguments, wrote a prompt and gave that to ChatGPT-4 together with this state of the file. I'll dissect and discuss that prompt because I think it nicely shows some aspects of prompting.

In this chat please answer my programming questions as an expert.

That's an interesting point Andrej Karpathy raised in his highly recommended talk State of GPT. The language models are trained on all kinds of content, including low quality content, or code written by or for beginners, for that matter. So if you want to have quality, you actually have to tell it. Sometimes in gory detail, if you want the responses to match some guidelines.

I'm a expert professional Java programmer who also has intermediate level knowledge in shell scripting, Scala, Javascript, HTML, CSS. I work on a MacOS with ARM64 architecture (M1 processor), homebrew is installed, bash shell.

Here I'm trying to tell it about the tone of its responses - I don't want to have it explain to me what a variable is, or something. :-) Both of these are part of a basic prompt I add when having it write scripts.

I'm looking for detailed answers that explain why you are recommending what you recommend. Think aloud step by step and discuss the needed extensions, including what needs to be extended to implement the functionality for each option, before you print the code changes.

Now the actual task:

Extend the following script so that it matches the usage in the usage string. Currently it just prints the number of tokens.

Then I added the JavaScript file, let it run and had a look at what it generated. I did a few iterations, adding more stuff to the prompt until the script looked more like what I envisioned. And there were some things where I had to help it somewhat.

Especially consider that the tokenizer does not need to be called decoding is wanted, and that you need another global variable tokenToNumber for that. Use a switch statement to parse the options, as the first argument is always the option. You can output just the changes to the script, but make a complete implementation without leaving out anything.
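To give an idea of the structure that this part of the prompt asks for, here is a minimal sketch. Everything specific in it is an assumption for illustration: the option names -c, -e and -d are made up since the usage string is not shown here, the vocabulary is a toy one, and the greedy tokenize function is only a stand-in for the real tokenizer.

```javascript
// Hypothetical mini vocabulary; the index of a token is its number.
const vocabulary = ["Hello", " world", "!", "H", "e", "l", "o", " ", "w", "r", "d"];
// Reverse lookup from token to number, as a global variable.
const tokenToNumber = new Map(vocabulary.map((token, number) => [token, number]));

// Naive greedy stand-in for the real tokenizer: always takes the longest known token.
function tokenize(text) {
  const result = [];
  let rest = text;
  while (rest.length > 0) {
    let length = rest.length;
    while (length > 0 && !tokenToNumber.has(rest.slice(0, length))) length--;
    if (length === 0) throw new Error(`Cannot tokenize: ${rest}`);
    result.push(rest.slice(0, length));
    rest = rest.slice(length);
  }
  return result;
}

// The first argument is always the option, so a switch statement is enough.
function run(option, input) {
  switch (option) {
    case "-c": // count the tokens
      return String(tokenize(input).length);
    case "-e": // encode the text into token numbers
      return tokenize(input).map(token => tokenToNumber.get(token)).join(" ");
    case "-d": // decode token numbers; the tokenizer is not needed here
      return input.trim().split(/\s+/).map(number => vocabulary[Number(number)]).join("");
    default:
      throw new Error(`Unknown option: ${option}`);
  }
}

console.log(run("-c", "Hello world!")); // 3
console.log(run("-e", "Hello world!")); // 0 1 2
console.log(run("-d", "0 1 2"));        // Hello world!
```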

The last sentence was because it wanted to leave implementing some of the options to me "as an exercise". :-) That gave a nice attempt that did have a couple of bugs, though. So I had it write a little test program:

Please create a bash script that calls the script with all of these options.

and then fed the output of the script back to it multiple times to have some errors fixed. (That might be a good candidate for automation.) (If you want the gory details of these interactions, here is the chat, including some branches.) For the shortening and decoding I had to fix things manually, though. I admit that I currently have mixed feelings about programming with it. Some things are extremely fast, but if you have to explain what needs to be changed, it is often easier to change it yourself, unless it's about larger code fragments. But I guess you know that from working together with other people, too. :-) Still, I think writing the final script as it is now would have taken me considerably longer.