
mkaszkowiak
Joined 44 karma
A software developer ready to implement your next idea :)

My website and blog: https://kaszkowiak.org/en/ Contact me at: maciej@kaszkowiak.org


  1. Google is killing Android. Along with the side-loading changes, I'm losing the desire to keep using it, as it's no longer an open OS.

    What's the point of those changes? Does Google want to maintain its revenue from Play Store? Feels like a bad long-term decision, especially when Apple is releasing excellent phones.

  2. Happy to see competition in rerankers! Good luck with your product.

    My questions: which languages do your models currently support? Did you run multilingual benchmarks? I couldn't find an answer on the website.

  3. It's easier to handle edge cases. Real examples:

    - What if certain rows in a table don't need to be embedded?

    - What if we use a single API key for embedding both database rows and user queries, and it hits a rate limit - how do we prioritize user queries?

    - What if some rows should be vectorized using a different model, depending on an external configuration?
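
    A minimal sketch of how these three cases could be handled once the embedding step lives in application code. Everything here is hypothetical and for illustration only: the `skip_embedding` and `kind` row fields, the model names, and the `budget_per_tick` stand-in for a real rate limit. No actual embedding API is called.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Lower number = served first. Both levels are assumptions for this sketch.
USER_QUERY = 0
ROW_BACKFILL = 1


@dataclass(order=True)
class EmbeddingJob:
    priority: int
    seq: int  # tie-breaker, keeps FIFO order within one priority level
    text: str = field(compare=False)
    model: str = field(compare=False)


class EmbeddingQueue:
    """Shares one rate-limited API key between user queries and row backfills."""

    def __init__(self, budget_per_tick):
        self.budget_per_tick = budget_per_tick  # hypothetical requests allowed per tick
        self._heap = []
        self._counter = itertools.count()

    def submit(self, text, priority, model):
        heapq.heappush(self._heap, EmbeddingJob(priority, next(self._counter), text, model))

    def next_batch(self):
        # Pop at most budget_per_tick jobs; user queries always come out first.
        batch = []
        while self._heap and len(batch) < self.budget_per_tick:
            batch.append(heapq.heappop(self._heap))
        return batch


def should_embed(row):
    # Case 1: some rows never need an embedding (hypothetical flag).
    return not row.get("skip_embedding", False)


def pick_model(row, config):
    # Case 3: the model comes from external configuration, keyed by row kind.
    return config.get(row.get("kind"), config["default"])
```

    With a budget of 2, a user query submitted after a pile of row jobs is still returned in the next batch ahead of them, which covers the second case.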

  4. I agree with the author - introducing a vector database often isn't worth the extra complexity.

    Personally, I can vouch for ParadeDB: https://www.paradedb.com/

    It adds extensions to PostgreSQL that enable vector indexing, full-text search and BM25. It works great and the developers are helpful!

    The major difference is that you must generate the embeddings yourself, but I consider it an upside - to each their own :)

  5. I've also thought about creating a Codenames bot: what if we could use semantic similarity to batch words together? Surely, this can be done using a prebuilt embedding model and clustering!

    After some failed experiments - it performed worse than I thought it would - I googled the subject, and... it turns out there's a whole paper about ML and Codenames :)

    https://arxiv.org/abs/2105.05885 (Playing Codenames with Language Graphs and Word Embeddings) - fun to read
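
    For illustration, the naive "group words by semantic similarity" idea could look like the sketch below. The 3-dimensional vectors are made-up toy values, not output from any real embedding model, and the greedy threshold grouping is just one simple stand-in for clustering; it is not the method from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def group_words(vectors, threshold=0.8):
    """Greedy grouping: a word joins the first group whose first word is similar enough."""
    groups = []
    for word, vec in vectors.items():
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(word)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append([word])
    return groups


# Toy vectors (assumption): a real bot would load these from a pretrained model.
TOY_VECTORS = {
    "cat": (0.9, 0.1, 0.0),
    "dog": (0.8, 0.2, 0.1),
    "car": (0.1, 0.9, 0.1),
    "truck": (0.0, 0.8, 0.2),
    "moon": (0.1, 0.1, 0.9),
}

print(group_words(TOY_VECTORS))
```

    On clean toy data this separates animals, vehicles and the outlier; real board words are far noisier, which fits the disappointing results mentioned above.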

  6. Glad to hear that :) Thanks for developing Marker!
  7. Thanks for answering! In my case, I don't use RAG directly, but rather post-process documents via LLMs to extract a set of specific answers. That's also why I asked about deduplication - asking an LLM to provide an answer from 2 different data sources (invalid unstructured table text & valid structured table contents) quickly ramps up errors.
  8. How do you combine the outputs? Wouldn't there be data duplication between unstructured text and tables?
  9. Did you encounter hidden costs when using Azure Document Intelligence? I processed some PDFs using the paid tier, but the resulting costs were way higher than expected, despite using a prebuilt layout model for only structured extraction. Have no clue what could cause it, no extra details on the billing page. Not sure if the price is misleading, or if it's a skill issue on my part :)
  10. For my use case, Marker works pretty well overall, but it has issues with tables: merged cells, misplaced headers, and so forth. I'm currently extracting Polish PDFs that are //not// scanned.

    When compared to Azure Document Intelligence, Marker is really cheap when self-hosted (assuming you fall under the license requirements), but it does not produce high quality data. YMMV.

  11. This aurora was really powerful! I could see it with the naked eye from a town in central Poland, despite cloudy weather and light pollution. Feels great to finally see it in person.
  12. Surprised by the number of negative comments. Kudos to the team! This is very impressive to accomplish in 24h with a 3-person team.
  13. Thanks for the links! I'll read them :)
  14. I still don't see how developing high quality software is related to one's personal viewpoint on taxes.
  15. I agree that it's problematic, however:

    > welfare/health care system is bad, taxes are not used well

    There's a widespread lack of trust in the Polish government, which decreased even further during the 2015-2023 period. If the money is being funneled to the ruling politicians' families and friends, why willingly pay high taxes? I believe this is an underlying core issue, which would probably take a new generation to repair.

  16. Good questions, no clue. The answer probably lies somewhere between a "badly designed tax system" and "stimulating growth of the IT sector".
  17. Similar laws exist in Poland, except they're not really enforced.

    It's really rare for the tax office to prove that a company exists solely for tax optimization. The risk drops to virtually zero if one freelances after hours and has legitimate invoices with other companies.

    This often causes a mismatch between Polish employees who wish to work remotely abroad and, for example, employers from the DACH region, where I've heard the laws are strictly enforced. One party claims there is no risk, and the other claims it's too risky :-) (leaving other factors aside, such as employee protection, etc.)

  18. How is tax optimization related to the quality of developed software?
  19. Taxes.

    The standard tax rate (on UoP) is 12% up to ~30k USD; the rest is taxed at 32%. On top of that, the employer pays a social security fee, whose rate rises proportionally to income.

    As a one-person business, you have two popular options:

    - 12% flat tax rate on revenue, with a flat rate social security fee; (1)

    - 19% flat tax rate on income. The social security fee depends on income, but it's lower than on UoP. You can write off expenses in this scenario, so the effective tax rate is lower. People generally try to write off as much as they can - for example, the tax agency is OK with programmers buying multiple bikes as a means of "transport to clients" ;)

    You can also write off VAT in both scenarios, effectively making a lot of major purchases (desks, chairs, phones, etc.) way cheaper. There's also a 5% tax rate, called IP Box, but it's tricky and doesn't apply to every scenario, so I'm leaving it aside.

    With the employer spending 5k EUR per month (21,7k PLN), you're left with:

    - 14,6k PLN on UoP

    - 18,5k PLN on 12% tax

    - 16,7k PLN on 19% tax, out of which you can potentially recover 3,9k PLN
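
    To put these figures side by side, here is a small sketch that computes how much of the employer's monthly cost ends up with the developer. It only uses the numbers quoted above; the last entry assumes the recoverable 3,9k PLN is added on top of the 16,7k PLN, which is one possible reading.

```python
EMPLOYER_COST_PLN = 21_700  # 5k EUR per month, as quoted above

# Monthly take-home amounts from the list above, in PLN.
TAKE_HOME_PLN = {
    "UoP": 14_600,
    "12% flat tax": 18_500,
    "19% flat tax": 16_700,
    # Assumption: the recovered 3,9k PLN is added on top of the 16,7k PLN.
    "19% flat tax + write-offs": 16_700 + 3_900,
}


def retained_share(take_home, employer_cost=EMPLOYER_COST_PLN):
    """Fraction of the employer's cost that reaches the developer."""
    return take_home / employer_cost


for option, amount in TAKE_HOME_PLN.items():
    print(f"{option}: {amount} PLN ({retained_share(amount):.0%})")
```

    That works out to roughly 67% on UoP, 85% on the 12% rate, and 77% on the 19% rate (up to about 95% under the write-off assumption).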

    It's easy to see why software developers choose to start a one-person business. It's worth jumping through the hoops to save on taxes.

    (1) There are actually 3 levels dependent on income, but the fee is lower than the UoP fee for most software developers

  20. I've used and had fun with Game Maker at his age - not sure how it fares with 3D though, there might be better alternatives nowadays :)

