Using the best thinking models without burning money (Jul 31st, 2025)

Intro

I’ve been having a dilemma: I don’t want to pay OpenAI the $20 monthly subscription, but I want to use the latest and greatest models on some of my problems to save time. How can I achieve this? As I mentioned in a previous post last year, I use OpenAI API keys, with mobile and desktop clients that need just the API key. A couple of months ago OpenAI also cut the price of the o3 model by 80%. I’ve been using the AI very actively, trying to burn through my credits as fast as possible.

Deciding on a budget

I decided to load my account with 6 dollars; by my estimate, I’d run out of credits within half a month. I can use the o4-mini thinking model for most tasks and the o3 thinking model only for tough questions. Nowadays I always default to o4-mini; it’s very cost-effective. I told myself I’m getting this instead of one KFC Ultimate Savings bucket.

The UI

The ChatGPT UI is clean and gives more control to devs. I like this UI, though people unfamiliar with it might need to get used to it. It even shows the thinking tokens. At the time of writing, only the OpenAI web portal and the official API client show the thinking tokens; if you use third-party clients for the chat API (more about this later in this post), you will not get the thinking tokens, just the output.

Using the right models for the right usage

I used OpenAI models exclusively and stopped using Google AI Studio and Claude. For more context: I used to use Google AI Studio and Claude occasionally to validate my solutions, as in last month’s post about rate limiting with Firestore.

ChatGPT has different models. For most of my use cases I use o4-mini. At the time of writing (31st July, 2025), the latest thinking models from OpenAI are o3 and o4-mini-high. I’ve created tables comparing the pricing, taken from the OpenAI portal (OpenAI API Pricing).

(Maybe the pricing details below will act as a snapshot for me, or anyone, looking back 10 or 20 years from now.)

Thinking models

Model	Description	Input Price	Cached Input Price	Output Price
OpenAI o3	Most powerful reasoning model with leading performance on coding, math, science, and vision	$2.00 / 1M tokens	$0.50 / 1M tokens	$8.00 / 1M tokens
OpenAI o4-mini	Faster, cost-efficient reasoning model delivering strong performance on math, coding and vision	$1.10 / 1M tokens	$0.275 / 1M tokens	$4.40 / 1M tokens

Non-thinking models

Model	Description	Input Price	Cached Input Price	Output Price
GPT-4.1	Smartest model for complex tasks	$2.00 / 1M tokens	$0.50 / 1M tokens	$8.00 / 1M tokens
GPT-4.1 mini	Affordable model balancing speed and intelligence	$0.40 / 1M tokens	$0.10 / 1M tokens	$1.60 / 1M tokens
GPT-4.1 nano	Fastest, most cost-effective model for low-latency tasks	$0.100 / 1M tokens	$0.025 / 1M tokens	$0.400 / 1M tokens
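
To make the tables above a bit more concrete, here is a minimal sketch of how a single request’s cost could be estimated from them. The prices are copied from the tables; the `PRICES` dictionary, the `estimate_cost` function, the lowercase model keys and the token counts in the usage lines are all my own hypothetical names and numbers, not anything from OpenAI’s tooling. As far as I understand, thinking tokens are billed as output tokens, so they would go into `output_tokens` here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per 1M tokens, as listed in the tables above (Jul 31st, 2025)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


# Keys are illustrative labels, not guaranteed to match real API model IDs.
PRICES = {
    "o3": Price(2.00, 0.50, 8.00),
    "o4-mini": Price(1.10, 0.275, 4.40),
    "gpt-4.1": Price(2.00, 0.50, 8.00),
    "gpt-4.1-mini": Price(0.40, 0.10, 1.60),
    "gpt-4.1-nano": Price(0.10, 0.025, 0.40),
}


def estimate_cost(model, input_tokens, output_tokens, cached_input_tokens=0):
    """Return the estimated cost in USD for one request.

    input_tokens counts only the non-cached input; cached input is
    passed separately because it is billed at the cheaper cached rate.
    """
    p = PRICES[model]
    total = (
        input_tokens * p.input_per_m
        + cached_input_tokens * p.cached_input_per_m
        + output_tokens * p.output_per_m
    )
    return total / 1_000_000


# Hypothetical request: 3,000 input tokens and 1,500 output tokens.
for name in ("o4-mini", "o3"):
    print(f"{name}: ${estimate_cost(name, 3_000, 1_500):.4f}")
```

With these prices the same request costs a little under twice as much on o3 as on o4-mini, which is why defaulting to o4-mini and saving o3 for tough questions stretches a small budget.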

I always use a thinking model because I think I’m biased towards them; I need to change my view on this, lol. My thinking is that a reasoning model should always be better, but I have seen videos from bytecloud (YouTube) and posts on Reddit showing that this is not always the case.

Reason I don’t use OpenRouter

The reason I don’t use OpenRouter is the tax on money top-ups. Let me explain: when I load money into the OpenAI API, I have a trick that lets me load the full amount without tax, but when I try the same thing with OpenRouter, it doesn’t work. That’s the only reason I’m not using OpenRouter. One thing to point out: I do use OpenRouter, I’m just not loading money into it. There are free models on OpenRouter, including reasoning models; I’ve tried some, like “DeepSeek: DeepSeek V3 0324 (free)”, which is completely free and works great, but that’s just for testing.

About Perplexity

As far as I know, Perplexity pulled a stunt in India (I don’t know about other countries): they partnered with Airtel, a telecommunications company, and made an offer to all existing Airtel users, whether prepaid, postpaid, broadband, DTH, etc., with no restrictions on plans. Users can claim Perplexity Pro for one year at no charge. I always thought OpenAI, Anthropic (Claude) and Google had caught up to Perplexity’s features and that its userbase was declining, because Perplexity’s main feature is search integrated with AI, and all the other major companies now have that option. I personally think this is a desperate move by a company with a shrinking userbase. Is it to harvest more data and use it? Or to show investors a large userbase? They did say they had funding for many more years, so idk, maybe I’m looking too deep into this. I tried to use it for some applications; it was okayish but not useful all the time.

As you may know, I recently switched all my mobile numbers and broadband to Jio. Well, not all: I still have one private number, and my mom’s number didn’t get ported over because of a complication caused by my not being there during the porting (she is now on a yearly Airtel plan with about 10 months left, I think). So I used my mom’s number to claim the reward.

But I don’t think they are using the model selected in Pro mode under the hood, at least not all the time, because the results are bad compared with OpenAI o3 (the model mentioned earlier, at the time of writing); the results from OpenAI are better. You’ll have to trust my word on this one, since I didn’t save my responses, but maybe I can do that as part of another post, or you can run your own test.

AI in Office

At the office, at the time of writing, I (you) work at VZ, and we have Gemini Pro (2.5 Pro is the latest from Google Gemini) and GitHub Copilot included. You can use them as much as you want, without limits. GitHub Copilot has lots of limitations inside the enterprise: many public features are not available, like MCP servers, custom chat modes and the full set of publicly available models. Sometimes, while the AI output is being generated, the model’s message gets flagged and removed from the chat, with a note saying it contains public code and asking us to retry with a different prompt. The company (leaders and managers) is also pushing hard for using AI for everything. I was on that train even before they started saying so: I was one of the first members to get access to GitHub Copilot and have been using it ever since. I was one of the beta testers for Copilot inside VZ as well, and I still have the public GitHub account we were testing it with. I provided a lot of feedback during that time; I don’t know whether it was valuable to them. During this period I also changed teams and organisations, got promoted, etc.

Using OpenAI with third-party clients

I tried several self-hosted open-source AI solutions so that I could host one on my server and configure different API keys, be that OpenAI, Google or Anthropic. I tried running them locally, but none of them satisfied my requirements. In the end I chose to look for client apps on Mac and iPhone and found OpenCat. It’s lightweight, really fast and met all my requirements. I use the free tier of the app; I can always mod the application if I want, instead of paying for a monthly or one-time subscription.

You can configure different default models; currently I have o4-mini as the default.

Conclusion

So, about the $6 I loaded up for AI usage: I got two months of usage out of it, which is a lot compared to the $20 monthly price tag of ChatGPT Plus or the costlier tiers. The key is to be smart about which model to use and how to consume it. Even if you are over-reliant on AI models for certain tasks, use them smartly and stay within the budget. During these two months I also used it to create images and experimented with audio.

I made around 300 requests, which accounts for about 2 million input tokens. Again, these are a mixture of all models; like I said, I preferred reasoning models, but used a combination of different ones.

About month-wise usage: OpenAI’s monthly usage report is weird. You can’t see the entire graph; you are literally given a sliding window of only one month in which you can see the requests.

I was able to stretch my $6 over two months and more; I only counted up to $5. If I hadn’t used it exclusively and for experiments, it would have lasted even longer.
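
The numbers in this conclusion can be put side by side with a bit of arithmetic. This is only a rough sketch: it takes the roughly $5 counted over two months, the roughly 300 requests and the $20 monthly subscription price from this post, and the function name `compare_with_subscription` is a hypothetical helper of my own.

```python
def compare_with_subscription(spent_usd, months, requests, subscription_monthly_usd):
    """Compare pay-as-you-go API spending with a flat monthly subscription."""
    subscription_total = subscription_monthly_usd * months
    return {
        "api_per_month": spent_usd / months,
        "api_per_request": spent_usd / requests,
        "subscription_total": subscription_total,
        "saved": subscription_total - spent_usd,
    }


# Figures from this post: about $5 counted over 2 months, about 300 requests,
# compared with the $20 monthly subscription.
summary = compare_with_subscription(5.00, 2, 300, 20.00)
for key, value in summary.items():
    print(f"{key}: ${value:.2f}")
```

So the API route came to about $2.50 a month against $20, a difference of roughly $35 over the two months, at under two cents per request on average.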

Personal Update

I’m still not using the time tracker from last month’s suggestion; it’s a bit of a hassle. In between, I tried writing everything down physically (pen and paper), because I was getting too focused on one thing and forgetting my other tasks. As an adult I have a lot of things to do all the time. I’ve got to reason through my decisions and spend my time on things appropriately.

I’ve been getting into reverse engineering over the last 30 days. It started because I wanted to lay out designs in AutoCAD. My usage is minimal, so I don’t want to pay for it (even the civil engineers who live here don’t pay for AutoCAD), but I also don’t want to use cracks from the internet for the latest versions; with enough effort I can crack it myself. The one thing that is different is that I have to mod the apps on a Mac, and on ARM/Apple silicon at that (I previously wrote about how to mod apps for macOS in a 2022 post, “modding a macos app”, but that was on Intel/x86). That was a learning curve for me. I also have to deal with SIP on macOS, either disabling it by booting into recovery mode or keeping it active and patching it. Debugging and cracking applications is only possible with SIP disabled, because macOS won’t allow debugging an application while it is turned on. Alternatively, I can do static analysis, patch it that way, sign it with my custom certificate and run it, etc. I want to write about it separately.

Long story short, I made a patch so that after the check for an expired trial, the app doesn’t quit. I also blocked internet access for the CAD software and patched the function that checks for a trial license to return false. This made the application work: I can use it without internet, all features work, and the application neither quits nor complains about not having a license. Going back to reverse engineering was a lot of fun and I still want to do it every day; I’m trying to find time every day to do more.

One thing I found out was that I need CAD Architecture, which is only available on Windows; my civil engineer friend told me to use that software for design. I daily-drive an Apple silicon Mac nowadays, so I had to run Windows on it. Surprisingly, there is Parallels: I tried the trial version, installed the ARM version of Windows 11 in it, installed the Architecture trial (the x86 version of this software) inside that, and it worked with no errors. Since I’m running that inside a VM, I don’t have to patch it myself and spend a whole lot more time; I can use patches from the internet. The one thing I do have to patch is Parallels. Currently it’s in its trial period and works well.

Writing helps alleviate a lot of things. To future me reading this: I hope you got through this situation. My current plan is to write more. Writing without AI support also makes me write more. What I’m writing is more of a personal journal, so I don’t care about the quality.