The More You Push the AI to Build an App, the More Confused It Gets

I wrote the other day about my first attempt to write a Flutter app for mobile using the AI to help.

Now, let's note that I asked it to show me how to build it step-by-step instead of building it itself. But from a certain point on it amounts to the same thing, because after it builds the app, you would likely have requests for updates, or things that don't quite work out the way you wanted, so you'd still go through a series of feedback rounds and rebuilds until you get something you like, if you are lucky.

Today I got back to this little pet project to see if I could move it forward from where we left off on Sunday.

I hadn't closed the tab with the ChatGPT conversation, and it was still there... When I restarted the conversation, the LLM picked up well from where we left off.

It lays things out well at every stage of the development, describing where we are, the steps that need to be done, and what should work once all steps are completed. Here's an example of the summary it makes at the end of each stage:

[image: ChatGPT's end-of-stage summary]

But where ChatGPT messes up is in the details. I noticed that on Sunday as well as today.

For example, in a block of code it recommended I add for one of the steps, it mixed up the logic of the operations, placing a line inside a block instead of outside it. When I pointed out the error, it recognized and fixed it (but it said "you have an error in your code," or something like that, not that it had provided the wrong code snippet; so they definitely learned well from us how not to assume blame and to pass it to others 😄).
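To illustrate the kind of misplacement I mean, here's a toy sketch (in Python rather than the Dart/Flutter code from my project, and with made-up names): a line that belongs after a loop ends up inside its body, so it runs once per iteration instead of once at the end.

```python
items = [3, 5, 7]

# Buggy version: the "save" step was placed inside the loop's block,
# so it runs once per item instead of once at the end.
saved_buggy = []
total = 0
for value in items:
    total += value
    saved_buggy.append(total)  # misplaced: inside the loop body

# Correct version: the same line belongs outside the loop.
saved_ok = []
total = 0
for value in items:
    total += value
saved_ok.append(total)  # runs once, after the loop finishes

print(saved_buggy)  # [3, 8, 15] -- three partial saves
print(saved_ok)     # [15] -- one final save
```

Both versions are syntactically valid, which is exactly why this class of mistake is easy for an LLM to make and for a reader to miss.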

I also observed that it tends to mix up variable names between one step and a later one, so it lacks the consistency to keep the logical thread of the context (particularly around variable names, I noticed) from one prompt to another.

It's true, I did stress-test it. I mean, it was trying to follow a plan, a structured way to present the steps necessary to build the app, and I kept interrupting: asking it to fix bugs, changing priorities between what it proposed and what I wanted to build next, or reminding the AI that we didn't have a feature it was talking about yet (that's another thing: at some point it started talking about a feature we hadn't built yet as if it already existed).

The deeper we go into this (and this is a simple app), the less useful the AI becomes, in my opinion, for the development of the app. It can help a coder speed up their work through the IDE or with research, and maybe it can even write pieces of software, but at least on a step-by-step basis, the LLM seems to have trouble staying focused on the details, even if it nails the general context quite well.

One thing we have to consider is that the LLM doesn't actually see the full code, while I can. Maybe it would be a good refresher for the AI if I provided the full code as context from time to time. Otherwise, it's a monumental job for the AI to keep parsing the conversation, work out what changes we made to the code, put them together logically, and remember them correctly.

Posted Using INLEO



34 comments

I also noticed that the further you go, the easier it is to "lose" part of the context or of the older exchanges, as if it lacks memory of the past, even if you don't close the tab like you did.


I know; it's a problem that happens even for regular prompts. It's probably a memory constraint for inference, since all that information has to be translated into huge matrices that take up a lot of memory and use serious resources to process, as far as I know.
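A back-of-the-envelope sketch of why those matrices blow up: in plain transformer self-attention (a simplifying assumption; real deployments optimize this heavily), the raw attention-score matrices alone scale with the square of the context length. The layer/head counts and context sizes below are made up purely for illustration.

```python
def attention_matrix_bytes(context_tokens, n_layers=32, n_heads=32,
                           bytes_per_value=2):
    """Rough size of the raw attention-score matrices alone:
    one (tokens x tokens) matrix per head per layer.
    Illustrative only; real inference stacks avoid materializing these."""
    return context_tokens ** 2 * n_layers * n_heads * bytes_per_value

# Doubling the conversation length quadruples this part of the cost.
short_ctx = attention_matrix_bytes(4_000)
long_ctx = attention_matrix_bytes(8_000)
print(long_ctx / short_ctx)  # 4.0
```

So a chat that's twice as long is, for this component at least, four times as heavy to process, which fits the "memory constraint" intuition.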


You might get better results with a local instance, where it can store more stuff (and only your convos, not millions like ChatGPT).


You might be right. I haven't delved into this much, since it's not a problem for me if this project doesn't work out in the end. It was just the first thing that popped into my mind to see how easy it would be to implement with the help of AI (and without any proper research on the subject beforehand; I just started with the prompt).


You made me wonder if it can help build an AAA game.


Probably not yet. But it can already build games that look like older games.


This is the 'tokens'. All AI LLMs run inference based on token count, and you only get so many tokens. So if it starts to "run out of memory," the best thing to do is start a NEW chat with updated code or info and continue the conversation there, because your tokens reset.


I didn't know that! That's for ChatGPT and the online ones, right? On a local one you can set a higher amount of tokens?


No, I don't think so. Probably the opposite, because you have your own device's memory as a limitation. Usually, people who run local LLM models do it with less complex models (or highly specialized ones) because of the inference and memory requirements.


The only idea I have would be expensive, from what I understand about most LLM fees. That would be to just start over with a prompt to fix any errors, but I think input is where fees are applied, and a full program would probably cost a lot.


I don't think it's a matter of fees in this case. I was able to do it so far on the free version without abusing prompts (I even used it without being logged in).

Maybe a pro version would be better, I don't know. But I've heard this issue of losing context happens regardless of the type of user.

Yeah, I could probably get the app working with the basic functionality I want, but the deeper we go, the more it needs my help to keep track of details and fix the bugs. And I have a background in programming; I can imagine this wouldn't be easy for someone with no idea how to code. I might as well learn and do things myself.


So the underlying problem with context is the exponential growth of memory usage to track all the word correlations, plus the translations to and from the AI's internal representation and English or whatever programming language. The concepts behind AI are very interesting, and I am torn between using it and not.

EDIT: Oh btw my friend @cherokee4life aka (witness4all) has been playing around with running a llm, so I am sure he can add more info here.


Yes, I figured the memory usage must be the issue too, the longer the context gets. And as you said, it grows faster than linearly. I'd be curious to see if AI researchers will find a workaround for this issue or if new tech will be invented to solve it.


Who knows what the future holds with advances in compression algos and better LLMs.
Enjoyed the chat, have some !PIZZA as brain food.


Likewise! Thanks for dropping by and for feeding me. 😁


So... running an LLM is next on my list. What I am currently running, though, is a self-hosted AI image generator.

But the issue @gadrian is talking about is something I went through A LOT with various AIs when it came to setting up my Hive Witness Node.

I learned a few tricks that may help (if you don't already know).

  1. Check the model you are using. Usually the default model for ChatGPT or Gemini is not for coding but for conversational chat, so switch the model.

[image: model selection screenshot]

  2. Try out Google Gemini and its 2.5 Pro. I had better luck with coding questions with Gemini than with ChatGPT.

  3. If using ChatGPT, make sure 'Temporary' is turned OFF. Then start a conversation and add bits of code. What I did was, on day 1, explain what I was trying to build. Then I would copy and paste about 30 lines of code and tell it this was my first section of code. Then copy-paste another 30 lines and tell it that was my second section of code. Rinse and repeat until all your code is in that chat.

Then, if you come back to it the next day, make sure to select that SAME chat and continue the conversation. It will have your chat history to pull from.

But keep in mind each AI uses 'tokens' for inference, so if your code is MASSIVE you will use a ton of tokens (it comes out to roughly 1 token per 2/3 of a word, give or take).
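Using that rough 2/3-of-a-word-per-token figure, you can sketch a quick estimator for how many tokens a paste will cost. This is a hypothetical helper, not any real tokenizer; actual tokenizers (BPE and friends) vary by model and by content, and code usually tokenizes worse than prose.

```python
def estimate_tokens(text, words_per_token=2/3):
    """Very rough token estimate from word count, using the
    ~2/3-of-a-word-per-token rule of thumb. Real tokenizers
    split on subwords and punctuation, so treat this as a
    ballpark, not a bill."""
    n_words = len(text.split())
    return round(n_words / words_per_token)

snippet = "final counter = 0;  // one short line of Dart"
print(estimate_tokens(snippet))  # 15 -- 10 whitespace-separated words
```

By this rule, a 1000-word code dump would weigh in around 1500 tokens, which is why massive pastes burn through a context window quickly.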

But if you are trying to build an application or something similar, try Google Firebase Studio:

https://studio.firebase.google.com/
https://gemini.google.com/
https://chatgpt.com/


Thanks for your awesome tips! It's true, I haven't checked which model was selected by default, but a pro model is not something I'd like to pay for.

The other tips are equally important. I had already thought of feeding the model the entire code so that it wouldn't lose the context as much, but I admit I hadn't thought of feeding it broken into sections. That probably helps with token consumption, right?


The way I understand token consumption, it's basically all the tokens a model can use at one time, so breaking the code into smaller sections won't help the token consumption much, EXCEPT...

Your code can be broken into sections. For example, when I was trying to figure out my Witness Node, I would break my chats up by section. So I had a chat about Linux commands and how to do what I needed to do. Then I had a chat just for downloading the specific file and the steps to set up a node. Then a chat for what to change in the config file (since that was its own chunk of code).

Splitting it up that way would help for sure, but I am not sure it applies to what you are doing. However, suppose you have something like 1000 lines of code for an app.

Find a good chunk that is sort of self-contained, like a subsection of code for, I don't know, displaying a video. Feed it that code and solve that smaller problem. Then work your way outwards to the bigger context. Granted, that may not work with dependencies and all that, but you never know until you try :)
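The paste-in-sections workflow described above (30-ish lines at a time, each labeled so the model can refer back to "section 2") can be sketched as a tiny script. The helper name and chunk size here are made up for illustration, not from any real tool.

```python
def chunk_lines(source, lines_per_chunk=30):
    """Split source code into labeled ~30-line sections,
    ready to paste into a chat one message at a time."""
    lines = source.splitlines()
    chunks = []
    for i in range(0, len(lines), lines_per_chunk):
        body = "\n".join(lines[i:i + lines_per_chunk])
        label = f"Section {len(chunks) + 1} of my code:"
        chunks.append(label + "\n" + body)
    return chunks

# Example: a fake 70-line file splits into 3 sections (30 + 30 + 10).
fake_source = "\n".join(f"line {n}" for n in range(1, 71))
sections = chunk_lines(fake_source)
print(len(sections))  # 3
```

Splitting along self-contained units (a widget, a config block) rather than at an arbitrary line count would work even better, for the dependency reasons mentioned above.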


Thanks! So, you broke your prompts down into sections to make it easier for the AI to focus on specific problems. Makes sense. I would have done the same had I had to tackle a complex problem.


I hear that people have a lot more success if they use the tools from @codeguidedev on X to create a detailed set of documents and instructions for the AI as guardrails during the coding process.


Thanks for sharing! If I get more serious about this, I will check it out. Sounds like a good strategy.


I think the most important question is whether building an app is going to give you energy long term.

Currently a lot of people try it, because it is very easy to get started now. I've tried a couple of things, but I lose interest pretty quickly.


I've tried a couple of things, but I lose interest pretty quickly.

Haha! I completely understand. I tend to do the same. I don't have the same drive and motivation in this direction as I had when I was younger. Back then, I could spend a week or more trying to get out of a blockage while coding (without being interested in anything else); now I don't have that kind of patience anymore.

But I'll probably still work on this project every now and then to see where it goes, and to learn more about the limitations of AI and how to use it properly in the process.


I will try some things from time to time as well. Just to get a feel for it.

Besides that, it will be very important to stay up to date with the latest developments, because every part of life is going to change dramatically in the coming years.


Same here. I took a break for now, but I'll probably keep trying to refine my understanding of using AI for coding, even though I'll probably not go back to coding like in my youth.

It may be that by the time I figure out how to work with these models to build a project step by step, new models will be more capable and will make this entire process seem cumbersome.


Congratulations @gadrian! You have completed the following achievement on the Hive blockchain and have been rewarded with new badge(s):

You got more than 25500 replies.
Your next target is to reach 26000 replies.

You can view your badges on your board and compare yourself to others in the Ranking
If you no longer want to receive notifications, reply to this comment with the word STOP


I think it's easy for things to get mixed up. Maybe you should tell it to double-check the variable names and make sure they are correct before you build things? I think it needs to double-check things sometimes.


Things may be more complex than that. It turns out the longer you talk with it, the more it forgets due to token limits, and you have to start new conversations and give it the whole context again. In my opinion, that's a major limitation of LLM models, one that probably won't be overcome easily.


I asked Meta AI to write me something of about 300 words. It gave me less, and when I told it so, it agreed and apologized and said it would write more for me, yet it still gave me the same thing three times, apologizing all the while 😂


They say LLMs are already really smart when IQ tests are run on them.

My younger niece (she's 9) was asked to write a 100-word introduction to a story and came up with 101 words, and she wasn't pleased about the extra word. Very good intro too; it made sense in the context!

I think AIs still have a lot of growing up pains to go through.
