I agree, but I think such information should be available to all people.
Some people will do this anyway, and they can hide it by using a transparent pixel.
I did some research before writing the article. GitHub started proxying images in 2014, and a lot of repositories use this technique to track their stats. I think GitHub is OK with it.
>Unlike GitHub, most of them don't even bother proxying the image to hide the IP, referrer, and user agent. If you want to allow external images on your site, you must proxy them and hide everything about the person who requested them.
> A person with bad intentions can trick a victim into opening a profile that looks completely legit and learn their IP and browser.
Can you explain this in more detail? Given a profile host that doesn't proxy, how does that attack work?
1. Your browser requests the image from the external server (at this step the server gets your IP and, typically, your user agent, since that's just how browsers talk to servers)
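A minimal sketch of that step, using Python's stdlib `http.server` (the endpoint and file name are made up for illustration): any server behind an unproxied `<img>` tag sees all of this for free.

```python
# Hypothetical tracking endpoint: serves a 1x1 transparent GIF and logs
# everything an ordinary <img> request reveals about the visitor.
from http.server import BaseHTTPRequestHandler, HTTPServer

# A 1x1 transparent GIF -- the classic "tracking pixel".
PIXEL = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
         b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01"
         b"\x00\x00\x02\x02D\x01\x00;")

class Tracker(BaseHTTPRequestHandler):
    def do_GET(self):
        # All of this arrives with a plain image request -- no JS needed.
        print("ip:        ", self.client_address[0])
        print("user-agent:", self.headers.get("User-Agent"))
        print("referer:   ", self.headers.get("Referer"))
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.send_header("Content-Length", str(len(PIXEL)))
        self.end_headers()
        self.wfile.write(PIXEL)

# To run: HTTPServer(("127.0.0.1", 8000), Tracker).serve_forever()
```

A proxy like GitHub's camo sits between the visitor and this server, so the server only ever sees the proxy's IP and headers.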
I don't think system allocators are clever enough to handle allocating 100-500k very small objects per minute when Python is doing something intensive.
It's a pretty standard way to speed up allocation for dynamic languages. Game developers use similar techniques as well.
Can confirm for perl. We are doing the very same. It's a huge win.
Differences:
We never free empty pools.
Our arenas are just singly linked lists, so there's no need for the prev pointer.
Notes:
For a statically compiled perl the biggest win is to avoid arena allocation (mmap) at all. Data and code are made static. That's around 10-20% of the runtime (for short-running programs).
Also, we rarely free at the end; the OS does it much better than free(). Only the mandatory DESTROY calls and FileIO finalizers are executed.
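Neither CPython's nor perl's actual code, but the free-list idea both comments describe can be sketched in a few lines of Python (all names here are made up):

```python
class Pool:
    """Toy object pool: freed nodes go on a singly linked free list
    (threaded through the nodes themselves, no prev pointer) and are
    reused before asking the allocator for fresh memory."""

    def __init__(self):
        self._free = None          # head of the singly linked free list

    def alloc(self):
        node = self._free
        if node is None:
            return [None, None]    # fresh node: [next, payload]
        self._free = node[0]       # pop the head of the free list
        node[0] = None
        return node

    def free(self, node):
        node[0] = self._free       # push onto the free list
        self._free = node

pool = Pool()
a = pool.alloc()
pool.free(a)
b = pool.alloc()
print(b is a)                      # True: the freed node was reused
```

The point is that free() never returns memory anywhere; it just makes the node the next candidate for alloc(), which is why allocation becomes a couple of pointer moves instead of a trip into malloc.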
Can Python actually return memory pages to the OS, e.g. via sbrk() with a negative argument?
I'm currently having a problem with this, where I load a large deep learning model into "CPU" memory then move it to the GPU, but I can't get rid of the memory reserved by the process.
I can't answer your exact question, but any large allocations/deallocations should be handled by mmap under the hood, and in those cases the memory should be returned.
In your case, you should first consider the possibility that there is a pointer to your model's objects that is for some reason not being released. Even though you are moving your model to the GPU and perhaps removing your own references, there might be internal references to your model's data that are hidden from you. At least something to consider.
edit: To add to this, I'm now quite sure (though I could be wrong!) that whether Python uses sbrk with a negative value is out of Python's hands. Python is using malloc/free under the hood:
There's some flexibility for wrapping free in different ways in that file, but it basically always ends up calling free at the core; at least on my system I just verified that in a debugger. So if Python does default to malloc/free, then whether sbrk with a negative number ever comes into play is really a question of how your libc implements malloc/free.
Of course I might be wrong, but I think you should stop worrying about it at that level and instead look into object references first, as I detailed above.
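As a sketch of that first check (the `model` dict below is just a stand-in for a real framework object, and `cache` plays the role of a forgotten internal reference):

```python
import gc
import sys

model = {"weights": [0.0] * 1_000}   # stand-in for a large model object
cache = [model]                      # a "hidden" reference you forgot about

# getrefcount reports one extra reference for its own argument, so a
# value above 2 here means something besides the `model` name is holding on.
print(sys.getrefcount(model))

# get_referrers lists the objects keeping `model` alive -- the forgotten
# `cache` list shows up among them.
print(any(r is cache for r in gc.get_referrers(model)))   # True
```

If a real framework object still has referrers after you think you've dropped it, no amount of allocator tuning will get the memory back.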
Which framework do you use for deep learning? It can allocate some object on its own.
Can you share some stats from while the model is in use and from after it's no longer accessible? You can get them by calling the sys._debugmallocstats() function.
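For reference, `sys._debugmallocstats()` is CPython-specific and prints its report straight to stderr rather than returning anything, so the usual pattern is a before/after pair:

```python
import gc
import sys

# CPython-only; not part of the language spec. Dumps pymalloc's
# arena/pool/block statistics to stderr.
sys._debugmallocstats()   # snapshot while your objects are alive

# ... drop your references to the model here ...
gc.collect()              # make sure unreachable objects are collected

sys._debugmallocstats()   # second snapshot -- diff the two reports
```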
My article describes the tricks which are used inside the Python interpreter. Every improvement in the interpreter saves an insane amount of computing power considering its popularity.
Maybe I've missed something, but you can also get the same id (it's basically an address in memory) because of how memory allocation works. CPython has a special allocator that preallocates big chunks of memory and constantly reuses them, avoiding allocation overhead. I have an article on this too.
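A sketch of how that reuse can surface as a "duplicate" id. This is CPython-specific behavior (its per-size tuple free lists), not anything the language guarantees:

```python
# Build the tuple at runtime so it isn't a cached constant of the code
# object (a literal like (1, 2, 3) may never actually be freed by `del`).
a = tuple([1, 2, 3])
addr = id(a)
del a                     # the 3-tuple lands on CPython's size-3 free list
b = tuple([4, 5, 6])      # a new 3-tuple will often pop that same slot
print(id(b) == addr)      # usually True on CPython, but not guaranteed
```

So a matching id() only proves two objects are the same if both are alive at the moment you compare.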
Ahh.. interesting.
Then I guess using the id of the tuples to show that they are using the same object doesn't exactly prove the point.
To your original point (that tuples reallocate based on length): if I delete a tuple of length 3 and then create a tuple of length 5, I see the id change immediately. So that's correct.
Lists, on the other hand, seem to keep getting the same address in my limited test.
Strings seem to behave like tuples. When I delete a string and create a new one, it creates a new object with a new address... unless the strings are of the same length.
Perhaps this is no real revelation; I'm rather new to Python and spending my time poking around to see how it works. :)
> Then I guess using the id of the tuples to show that they are using the same object doesn't exactly prove the point
Without the `del a`, it does, because they both have active references. If they were unique objects, we'd see a unique ID for `b` as long as a reference to `a` is active.
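That guarantee is easy to check directly: id() values are only unique among objects that are alive at the same time.

```python
a = tuple([1, 2, 3])   # built at runtime to avoid constant caching
b = tuple([1, 2, 3])
print(a == b)          # True  -- equal values
print(a is b)          # False -- two live objects
print(id(a) != id(b))  # True  -- ids can't collide while both are alive
```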
> Strings seem to behave like tuples. When I delete a string and create a new one it creates a new object with a new address.... unless the strings are of the same length.
String _objects_ (as opposed to the variables referring to them) are immutable in Python. They tend to be allocated anew, but for optimization reasons you can end up with cases where the string objects have the same ID. Like here:
>>> a = 'asdf'
>>> id(a)
4389881424
>>> a = 'qwerty'
>>> id(a)
4395015672
>>> a = 'asdf'
>>> id(a)
4389881424
'asdf' has the same ID with no `del` involved because CPython interned the literal, so the first object was never freed.
Below are some relevant links if you want to knock yourself out (some do), but I write an awful lot of Python, and even for me this is well into the realm of "what happens when..." trivia for after a few beers, or for job interviews.
Most of my posts are about Python's internals and some security stuff