
pamelafox
2,064 karma
Principal Cloud Advocate @ Microsoft, focusing on Python! Formerly @ UC Berkeley CS, Khan Academy, Coursera, Google. More about me at www.pamelafox.org

  1. I'm on the Python advocacy team at Microsoft, so I've been experimenting a bit with the new framework. It works pretty well, and is comparable to LangChain v1 and Pydantic AI, but has tighter integrations with Microsoft-specific technologies. All the frameworks have very similar Agent() interfaces, as well as graph-based approaches (Workflow, LangGraph, Graph).

    I have a repository here with similar examples across all those frameworks: https://github.com/Azure-Samples/python-ai-agent-frameworks-...

    I started comparing their features in more detail in a gist, but it's still a WIP: https://gist.github.com/pamelafox/c6318cb5d367731ce7ec01340e...

    I can flesh that out if it's helpful. I find it fascinating to see where agent frameworks converge and diverge. Generally, the frameworks are converging, which is great for developers, since we can learn a concept in one framework and apply it to another. There are definitely differences, though, once you get into the edge cases and production-level sophistication.
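
    As a rough illustration of that convergence, here's a minimal sketch of the shared Agent() pattern, using Pydantic AI as the example (the model id, prompt, and tool are invented for illustration; the Microsoft Agent Framework and LangChain v1 equivalents are spelled slightly differently but have a similar shape):

        # A minimal sketch of the shared Agent() pattern, using Pydantic AI.
        # Model id, system prompt, and tool are made up for illustration.
        from pydantic_ai import Agent

        agent = Agent(
            "openai:gpt-4o-mini",
            system_prompt="You are a concise assistant.",
        )

        @agent.tool_plain
        def get_weather(city: str) -> str:
            """Hypothetical tool: return a canned forecast for a city."""
            return f"Sunny in {city}"

        result = agent.run_sync("What's the weather in Berkeley?")
        print(result.output)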

  2. Yes, AI Search has a new agentic retrieval feature that includes synthetic query generation: https://techcommunity.microsoft.com/blog/azure-ai-foundry-bl... You can customize the model used and the maximum number of queries to generate, so latency depends on those factors, plus the length of the conversation history passed in. The model is usually gpt-4o, gpt-4.1, or one of their -mini variants, so it's the standard latency for those models. A more recent version of that feature also uses the LLM to dynamically decide which of several indices to query, and executes the searches in parallel.

    That query generation approach does not extract structured data. I do maintain another RAG template for PostgreSQL that uses function calling to turn the query into a structured query, such that I can construct SQL filters dynamically (rough sketch at the end of this comment). Docs here: https://github.com/Azure-Samples/rag-postgres-openai-python/...

    I'll ask the AI Search team about SPLADE; I'm not sure.
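
    To make the structured-query idea above concrete, here's a minimal sketch of the function-calling step, assuming a hypothetical product catalog (the schema, table, and column names are not the template's actual ones):

        import json

        from openai import OpenAI

        client = OpenAI()

        # Hypothetical function schema: pull a structured filter out of the user's question.
        tools = [{
            "type": "function",
            "function": {
                "name": "search_products",
                "description": "Search the product catalog, optionally filtering by price.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string", "description": "Keywords to search for"},
                        "max_price": {"type": "number", "description": "Upper price bound, if any"},
                    },
                    "required": ["query"],
                },
            },
        }]

        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "climbing shoes under $50"}],
            tools=tools,
            tool_choice="required",
        )
        args = json.loads(response.choices[0].message.tool_calls[0].function.arguments)

        # Turn the extracted arguments into a parameterized SQL filter (never interpolate directly).
        sql = "SELECT * FROM products WHERE description ILIKE %s"
        params = [f"%{args['query']}%"]
        if "max_price" in args:
            sql += " AND price <= %s"
            params.append(args["max_price"])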

  3. I believe that Azure AI Search currently uses Lucene for BM25, hnswlib for vector search, and the Bing re-ranking model for semantic ranking. (So no, it does not, though the features are similar.)
  4. I know :( But I think vector DBs and vector search got so hyped that people thought you could switch entirely over to them. Lots of APIs and frameworks also used "vector store" as the shorthand for "retrieval data source", which didn't help.

    That's why I write blog posts like https://blog.pamelafox.org/2024/06/vector-search-is-not-enou...

  5. Do you mean that you're using the Copilot indexer for SharePoint docs? https://learn.microsoft.com/en-us/microsoftsearch/semantic-i...

    The AI Search team has been working with the SharePoint team to offer more options, so that devs can get the best of both worlds. They might have some stuff ready for Ignite (mid-November).

  6. At Microsoft, that's all baked into Azure AI Search - hybrid search does BM25, vector search, and re-ranking, just by setting booleans to true (rough sketch at the end of this comment). It also has a new agentic retrieval feature that does the query rewriting and parallel search execution.

    Disclosure: I work at MS and help maintain our most popular open-source RAG template, so I follow the best practices closely: https://github.com/Azure-Samples/azure-search-openai-demo/

    Few developers realize that you need more than just vector search, so I still spend many of my talks emphasizing the FULL retrieval stack for RAG. It's also possible to do it on top of other DBs like Postgres, but it takes more effort.
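
    For a sense of what "setting booleans to true" amounts to, a hybrid + semantic-ranker query with the azure-search-documents Python SDK looks roughly like this (endpoint, index, and field names are placeholders, and exact parameters can vary by SDK version):

        from azure.core.credentials import AzureKeyCredential
        from azure.search.documents import SearchClient
        from azure.search.documents.models import VectorizableTextQuery

        # Endpoint, index, and field names are placeholders.
        search_client = SearchClient(
            endpoint="https://<your-service>.search.windows.net",
            index_name="docs-index",
            credential=AzureKeyCredential("<api-key>"),
        )

        query = "What does my health plan cover?"
        results = search_client.search(
            search_text=query,                        # keyword (BM25) search
            vector_queries=[VectorizableTextQuery(    # vector search, vectorized server-side
                text=query, k_nearest_neighbors=50, fields="embedding",
            )],
            query_type="semantic",                    # turn on the semantic re-ranker
            semantic_configuration_name="default",
            top=5,
        )
        for doc in results:
            print(doc["title"])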

  7. I'd like to know as well, so that I can set up a caterpillar cam.
  8. Love this! Relatedly, does anyone have a suggestion for an outdoor solar-powered web camera that I could point at the critters in my garden? I'd love to stream a MonarchCam or MantisCam some day.
  9. Ooo, bobcats! I live in the Bay Area near Tilden Park, and I spent a while on iNaturalist trying to figure out where the bobcats hang out, as my 6-year-old is very interested in wild cats. I sadly realized that bobcats are usually out in the morning/evening, when we are not in the parks. Still used the bobcat stalking as an excuse to take a walk in Tilden today, though.

    What's your approach to finding the bobcat locations for your shot?

  10. I like this point, for people hiring DevRel:

    "Look in your community. Find users of your product or users of your competitor’s product. "

    I'm a current DevRel-er myself, and someone recently reached out looking to fill a DevRel role. I told them that I wouldn't actually be a good fit for their product (a CLI tool, and I'm not as die-hard of a CLI user as other devs), and suggested they look within their current user community. That's not always possible, especially for new products, but if a tool is sufficiently used, it's really nice to bring in someone who's genuinely used and loved the product before starting the role.

    My hiring history:

    * Google Maps DevRel, 2006-2011: I first used Google Maps in my "summer of mashups", just making all kinds of maps, and even used it in a college research project. By the time I started the role, I knew the API quite well. Still had lots to learn in the GIS space, as I was coming from web dev, but at least I had a lot of project-based knowledge to build on.

    * Microsoft, 2023-present: My experience was with VS Code and GitHub, two products that I used extensively for software dev. Admittedly, I'd never used Azure (only Google App Engine and AWS), so I had to train up on it rapidly. Fortunately, my experience with the other clouds has helped me with the MS cloud.

  11. It was fun! We still see Wave-iness in other products: Google Docs uses the Operational Transformation (OT) algorithm for collaborative editing (or at least it did, last I knew), and non-Google products like Notion, Quip, Slack, and Microsoft's Loop all have some overlap.

    We struggled with having too many audiences for Wave - were we targeting consumer or enterprise? email or docs replacement? Too much at once.

    The APIs were so dang fun though.

  12. Hm, I didn't work on the frontend, but I don't particularly remember griping. GWT had been around for ~5 years at that point, so it wasn't super new: https://en.wikipedia.org/wiki/Google_Web_Toolkit

    I always personally found it a bit odd, as I preferred straight JS myself, but large companies have to pick some sort of framework for websites, and Google already used Java a fair bit.

  13. I was on the Wave team! Our servers didn't have enough capacity; we launched too soon. I was managing the developer-facing server for API testing, and I had to slowly let developers in to avoid overwhelming it.
  14. How do you determine if the tools access private data? Is it based solely on their tool description (which can be faked) or by trying them in a sandboxed environment or by analyzing the code?
  15. I am giving it a go for parenting advice: “My 5-year-old is suddenly very germ conscious. Doesn't want to touch things, always washing hands. Do deep research, is this normal?” https://chatgpt.com/share/68be1dbd-187c-8012-98d7-83f710b12b...

    The results look reasonable? It’s a good start, given how long it takes to hear back from our doctor on questions like this.

  16. Both humans and coding agents have their strengths and weaknesses, but I've been appreciating help from coding agents, especially with languages or frameworks where I have less expertise, and the agent has more "knowledge", either in its weights or in its ability to more quickly ingest documentation.

    One weakness of coding agents is that sometimes all the agent sees is the code, not the outputs. That's why I've been working on agent instructions/tools/MCP servers that empower it with all the same access that I have. For example, this is a custom chat mode for GitHub Copilot in VS Code: https://raw.githubusercontent.com/Azure-Samples/azure-search...

    I give it access to run code, run tests and see the output, run the local server and see the output, and use the Playwright MCP tools on that local server. That gives the agent almost every ability that I have - the only tool that it lacks is the breakpoint debugger, as that is not yet exposed to Copilot. I'm hoping it will be in the future, as it would be very interesting to see how an agent would step through and inspect variables.

    I've had a lot more success when I actively customize the agent's environment, and then I can collaborate more easily with it.

  17. When you describe subagents, are those single-tool agents, or are they multi-tool agents with their own ability to reflect and iterate? (i.e. how many actual LLM calls does a subagent make?)
  18. I ran bulk evaluations on a RAG scenario and wrote up the results - discovered interesting differences (gpt-5 loves lists, smart quotes, and admitting it doesn't know).
  19. I just ran evaluations of gpt-5 for our RAG scenario and was pleasantly surprised at how often it admitted “I don’t know” - more than any model I’ve eval’d before. Our prompt does tell it to say it doesn’t know if context is missing, so that likely helped, but this is the first model to really adhere to that.
  20. We use text-embedding-3-large, with both quantization and MRL reduction, plus oversampling on the search to compensate for the compression.
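
    On the embedding side, that looks roughly like this (the dimension count is just an example, and the int8 step is a toy illustration; the oversampling is configured on the search index, not shown here):

        import numpy as np
        from openai import OpenAI

        client = OpenAI()

        # Request MRL-reduced embeddings via the dimensions parameter
        # (text-embedding-3-large supports Matryoshka-style truncation natively).
        resp = client.embeddings.create(
            model="text-embedding-3-large",
            input=["How do I rotate my API keys?"],
            dimensions=1024,  # example reduction from the default 3072
        )
        vec = np.array(resp.data[0].embedding, dtype=np.float32)

        # Naive scalar (int8) quantization, just to illustrate the compression idea;
        # in practice the search service handles quantization on the index side.
        scale = np.abs(vec).max() / 127.0
        vec_int8 = np.round(vec / scale).astype(np.int8)
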
  21. I am testing out gpt-5-mini for a RAG scenario, and I'm impressed so far.

    I used gpt-5-mini with reasoning_effort="minimal", and that model finally resisted a hallucination that every other model generated.

    Screenshot in post here: https://bsky.app/profile/pamelafox.bsky.social/post/3lvtdyvb...

    I'll run formal evaluations next.
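
    For reference, the call shape I mean is roughly this (the system prompt is a paraphrase of the "say you don't know" instruction, not our actual RAG prompt, and the sources are stand-ins):

        from openai import OpenAI

        client = OpenAI()

        response = client.chat.completions.create(
            model="gpt-5-mini",
            reasoning_effort="minimal",
            messages=[
                {"role": "system", "content": "Answer ONLY from the provided sources. If the answer is not in the sources, say you don't know."},
                {"role": "user", "content": "Sources:\n[doc1] Employee handbook: ...\n\nQuestion: What is the parental leave policy?"},
            ],
        )
        print(response.choices[0].message.content)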

  22. I would argue that we're passing on 5% of life to the machines, not 100%. By the time bedtime has rolled around, my kids have been home for 5 hours - we have already spent hours reading, playing, parkour'ing, role-playing, painting, inventing, slime'ing, etc. We do manage to often tell a story ourselves (last night, we made the kids tell it!), but I am not going to judge a parent (or myself) for deciding to delegate a fraction of creative energy to a machine.

    I was 100% against screens when first having a kid, but now I'm content with kids getting a spectrum of entertainment styles, and for parents to get a break every so often.

  23. Gemini wrote that whole story with a short prompt about a "King Dragon that farts". I assure you that our actual improv'd story is far superior in plot points.

    And yes, I was confused too as to how farting would clear away fog.

  24. Lol, yes, the dragon's torso turned into a man. That man does show up earlier in the story - I think perhaps the model so closely associates dragon stories with stories of men that it just desperately wanted to add one in? The text itself never actually mentions the man/dragon/torso.

    If Gemini added a reflection step to its book-drawing routine, I think the model could easily notice the errors and generate images to correct them - the errors do not seem insurmountable.

    Given that, I'm assuming Amazon is or will soon be filled with decently illustrated somewhat amusing stories.

  25. Lol, I just tried to get it to draw the story about King Dragon farting, but it could not come up with a picture of a dragon farting - it turned it into fire coming from its mouth instead! It's too far outside its training data.

    Link: https://g.co/gemini/share/188609ce3e1f

  26. I think it'd be amazing if I had the energy to make up improv bedtime stories every night. (We have a "King Dragon" improv series happening lately, which involves a lot of farts)

    BUT, I don't always have that energy, and I already spend hours a day reading stories to my kids, so I am okay with them spending some fraction of time hearing stories from robots/screens/etc. (Lately, it's "Hey Google, tell a story" if mommy is too busy to read)

    I hope we never stop paying amazing children's book illustrators though! I have so many books where I marvel at each page and the ingenuity of the illustrative style.

  27. Yep, 20B model, via Ollama: ollama run gpt-oss:20b

    Screenshot here with Ollama running and asitop in another terminal:

    https://bsky.app/profile/pamelafox.bsky.social/post/3lvobol3...
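
    If you'd rather poke at it from Python, Ollama also exposes an OpenAI-compatible endpoint locally, so a sketch like this should work:

        from openai import OpenAI

        # Ollama serves an OpenAI-compatible API locally; the api_key is required
        # by the client but not actually checked.
        client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

        response = client.chat.completions.create(
            model="gpt-oss:20b",
            messages=[{"role": "user", "content": "Say hello in one sentence."}],
        )
        print(response.choices[0].message.content)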

  28. I ran it via Ollama, which I assume runs it in the best way available. Screenshot in my post here: https://bsky.app/profile/pamelafox.bsky.social/post/3lvobol3...

    I'm still wondering why my GPU usage was so low... maybe Ollama isn't optimized for running it yet?

