Most other languages don't use singly linked lists to begin with.

Because they suck.



I mean, they might suck, but it's not really true that most languages don't use them. I can think of a lot that use them as their default data structure (OCaml, Haskell, Scala, Common Lisp, Racket, Clojure, etc.). They're nice with recursion, or if you don't know the size of your data ahead of time. But honestly, I've only used recursion a handful of times in my entire career, despite working almost exclusively with functional languages.


Correction for Clojure: it doesn't use singly linked lists by default. It uses something more akin to an ArrayList, which Clojure calls a Vector. Elements are indexed, lookup is effectively O(1), and it appends to the end in effectively O(1) as well.


Correction: Clojure has linked lists and vectors. Haskell has both too. https://clojure.org/guides/learn/sequential_colls https://hackage.haskell.org/package/vector-0.12.1.2/docs/Dat...

Don't blame the language, & learn to use the right data structure for the job.


Parent was talking about the default data structure, and for Clojure, Vectors are the default "list" data structure, even though it has other types of lists.

Not trying to throw any of the other languages mentioned under the bus; I do not know them well enough to know what list structure they default to or more idiomatically rely on.


In Clojure, most standard functions return lazy sequences, and in code people usually use vectors, which are not linked lists. But it does of course have the classic Lisp list as well.


In Haskell, because of laziness, lists work well enough, but they're really just a thunk container. Conduits are better.


The way modern memory works, linked lists are practically a denial of service attack on system performance. IMO, like gotos, it's better to just forget they exist.


Linked lists stink as a manifested in-memory data structure. However, if your linked list is really "{concrete value} : {code to generate the rest of the list}" and you walk it once, never having the whole thing at once, it isn't necessarily bad. That is the point cryptonector is making. The memory inefficiency of a linked list doesn't matter if you never really have it "in memory".

That said, the ready availability of linked lists in Haskell does mean that you end up with rather more strictly-manifested linked lists than you might like, since any of them can be manifested as actual linked lists. If you're using Haskell though, while you haven't exactly completely given up on that sort of performance, it must not be foremost on your mind.


> Linked lists stink as a manifested in-memory data structure. However, if your linked list is really "{concrete value} : {code to generate the rest of the list}" and you walk it once, never having the whole thing at once, it isn't necessarily bad. That is the point cryptonector is making. The memory inefficiency of a linked list doesn't matter if you never really have it "in memory".

Yes, exactly. A lazy list is just another term for generator. If you hold on to a reference to the head of such a list as you run the generator, then you'll consume a lot of memory and will end up with a strict list, and also you'll be sad. But if you don't hold on to that reference, then it will be very efficient.
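
A minimal sketch of the difference in Haskell (the function names are mine):

    import Data.List (foldl')

    -- Streams in constant space: no reference to the head of [1..n] is
    -- kept, so each cons cell can be collected as soon as it's consumed.
    sumStreamed :: Int -> Int
    sumStreamed n = foldl' (+) 0 [1 .. n]

    -- Shares xs between two consumers: once length forces the spine,
    -- the whole list is live in memory until the sum has also run.
    sumRetained :: Int -> (Int, Int)
    sumRetained n = (length xs, foldl' (+) 0 xs)
      where xs = [1 .. n]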

> That said, the ready availability of linked lists in Haskell does mean that you end up with rather more strictly-manifested linked lists than you might like, since any of them can be manifested as actual linked lists. If you're using Haskell though, while you haven't exactly completely given up on that sort of performance, it must not be foremost on your mind.

Especially strings as lists of characters. That was a mistake. Well, in some sense, character data has to be a list, since with Unicode you really can't efficiently index a string... unless it's a rope and you apply the sorts of tools that xi does (e.g., keeping track of character counts per rope segment, etc.). But strings should be sufficiently opaque that arrays, lists, and ropes are all equally able to implement their semantics.
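
For what it's worth, the usual fix in Haskell today is a packed, opaque type instead of [Char]; a tiny sketch, assuming the text package:

    import qualified Data.Text as T

    lazyString :: String     -- really [Char], a linked list of boxed Chars
    lazyString = "hello"

    packed :: T.Text         -- opaque, array-backed; same string semantics
    packed = T.pack lazyString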


That’s frankly not a linked list.


Haskell has a duality between flow control and data structures as a result of its laziness. It is a perfectly valid point of view to look at such a thing as a control structure, rather than a data structure.

Technically, if you want to really understand how the "IO monad" does its thing, this is actually how it works. Your program declares a probably-infinite data structure that contains all possible paths your code could take; e.g., if you have "if user input is negative then do X else do Y", what you really have is a data structure representing the if-then clause. What keeps the infinite data structure from being manifested in RAM is that the whole thing is resolved lazily, so only one branch will actually be evaluated. That's why the "IO monad" is actually pure; it simply "purely" creates a huge data structure that is resolved by an interpreter later. In the theoriest of theories, an IO value in Haskell really doesn't execute anything either, and the whole language is pure... it is only the interpretation of the IO value that finally interfaces with the real world.

It isn't always the most helpful perspective and you can become an expert Haskell programmer without ever looking at your program this way, but technically, under the theory hood, that's what's happening.
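
If you want a toy model of that idea (a sketch of the concept, not GHC's actual IO machinery):

    -- A pure description of a program: just a data structure.
    data Program a
      = Done a
      | GetLine (String -> Program a)
      | PutLine String (Program a)

    -- Purely *builds* the structure; nothing here touches the world.
    greet :: Program ()
    greet = GetLine (\name -> PutLine ("hello, " ++ name) (Done ()))

    -- Only the interpreter at the edge performs effects; laziness means
    -- branches you never take are never even constructed.
    interpret :: Program a -> IO a
    interpret (Done x)      = pure x
    interpret (GetLine k)   = getLine >>= interpret . k
    interpret (PutLine s k) = putStrLn s >> interpret k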


Haskell is lazy; all data structures work this way unless specifically annotated otherwise. Within standard Haskell 2010 it is hard to declare a linked list that doesn't work this way.

It is also worth noting that Haskell has library-specified fusion rules, so the closure isn't even allocated most of the time. So

    sum (take 10 [0..])
compiles into an allocation-free loop. I will give you that this isn't a linked list anymore, though.
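
Roughly the shape of the loop you end up with (a hand sketch of the result, not actual GHC Core):

    -- What `sum (take 10 [0..])` fuses down to, more or less: a counting
    -- loop with an accumulator, and no cons cells allocated at all.
    sumTake10 :: Int
    sumTake10 = go 0 0
      where
        go acc i
          | i >= 10   = acc
          | otherwise = go (acc + i) (i + 1)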


In Haskell lazy values are just thunks. So a Lisp-style cons can have a tail that's just a thunk, and when you need to coerce it to a non-thunk value, it just happens and presto, you have the next cons. That's a list! A lazy list.
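
The classic example (plain Haskell, nothing library-specific):

    -- The tail of each cons is a thunk, forced one cell at a time,
    -- so this "infinite" list behaves like a generator.
    nats :: [Integer]
    nats = 0 : map (+ 1) nats

    firstTen :: [Integer]
    firstTen = take 10 nats   -- [0..9]; only ten conses are ever forced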


Why not?


> The way modern memory works linked lists are practically a denial of service attack on system performance

Wrong. Intrusive doubly-linked lists are extremely performant and useful in the right places, especially if you have callbacks which store a pointer to a list element (this often happens with device drivers in low level programming), or if the elements can belong to several lists.

The Linux kernel, for example, makes plenty use of linked lists.

> IMO, like goto’s it’s better to just forget they exist.

Wrong again. In a language without automatic memory management and without scoped destruction (cough C cough), they are immensely useful. Again, refer to the Linux kernel. Or any competently written, non-trivial C program for that matter. Note: I'm not defending C here, I'm defending correctly used gotos in C programs.


I don't think you can really get around linked lists having bad cache locality, even if they're intrusive. The point of arrays usually being faster is that they're much more cache-friendly, and fitting data in cache and avoiding branch mispredictions is usually much more important than O(1) insert/delete.


If you allocate all the nodes of a particular list from one arena, the cache performance should be OK.

IDK if this is the case for device drivers, but AFAICT device drivers tend to preallocate memory on loading and manage it locally.


Linked lists fare worse than arrays only for sequential iteration in tight loops. That includes many/most cases, but not all. As for O(1) insertion/deletion: it's not always about amortized cost; sometimes you actually do care about worst-case execution time.

Look, I'm not saying that linked lists can replace arrays. In fact, I almost always use arrays instead of linked lists. What I'm arguing against is the notion that arrays always trump linked lists, which is false, and repeating that meme shows a lack of deeper understanding.


I don't think I've heard anyone seriously suggest never using a linked list -- the advice usually just ends up being "are you really sure?" which seems appropriate.


Then let me quote the post I originally replied to:

> The way modern memory works linked lists are practically a denial of service attack on system performance.

and

> IMO, like goto’s it’s better to just forget they exist.

That's as close to "never use a linked list" as you can get without saying that literal sentence.


Linked lists require you to be on the node in order to do an insertion, so it's generally not O(1); it's O(n) with sky-high constants. Thus, for random inserts, arrays are actually faster.

If you have a degenerate case and are never traversing nodes, you're still likely better off with a ring buffer, array list, 2 different stacks, or various other options. And frankly, you're never going to endlessly insert data without reading it.


> Linked lists require you to be on the node in order to do an insertion, so it's generally not O(1); it's O(n) with sky-high constants.

Re-read my original post: "[...] especially if you have callbacks which store a pointer to a list element [...]"

This happens all the time in kernel code or bare-metal programming.

What else can I say? Open the Linux source code [1] and see for yourself.

[1] https://elixir.bootlin.com/linux/latest/source


Look, we can all agree that if you're writing ASM you're going to use gotos. Further, if linked lists are tiny or rarely used, it's not a problem. If you can ensure the data stays in L2 cache, it's far less of an issue. In any number of situations they don't create problems, but frankly, the general case encountered by most programmers is not that.

PS: If you don’t feel like pointing to a concrete example, I am going to counter with 99% of the stuff you can find here: www.google.com.


Aren't there hybrid array lists? Like, where instead of growing the array when it goes beyond its bounds, the last slot is reserved to create a link to the next array chunk?


Linked lists are O(N) for random insertion, as you need to traverse the list to locate the correct node. The only way they are always O(1) is a degenerate case better suited to another data structure. E.g., if you want a queue, use a ring buffer, not a linked list.


Doubly-linked lists have problems with concurrent mutability, FYI. It's really better to aim for non-circularity and to avoid the need for locks.


> Doubly-linked lists have problems with concurrent mutability, FYI.

So do arrays, FYI.

> It's really better to aim for non-circularity and to avoid the need for locks.

How does an array avoid the need for a lock where a linked list needs one?


Due to GHC's fusion, they're often not even created. They're used more like control flow constructs than arrays.


Right, I mostly end up using Haskell's lists as an extremely ergonomic counterpart to iterators / generators in languages like Python and Rust.


This is only true if you’re writing code that’s compute bound: in 90% of the production code I’ve seen, the “inefficiency” of a singly-linked list is the absolute last thing that I worry about.


O(N) random reads and O(N) random inserts can quickly turn a wide range of things compute bound. Creating a linked list from scratch via random insertions, for example, is an O(N^2) operation with surprisingly large constants.

Now sure, if it's small and you're only creating it and then reading it once, that's fine. But your assumptions need to be accurate, or you've just created a problem.


In most of the cases where I’m using a list or array, I’m adding items to one end and then iterating forward over the structure a handful of times.

But, yeah, if you’re relying heavily on random access, pick a vector/array or some kind of tree/hash table.


> IMO, like goto’s it’s better to just forget they exist.

They're quite useful in functional languages.

In Haskell in particular they can have nice performance effects if you use them sensibly.


That's really not true. Whole web frameworks are built strictly around linked list structures that are far more robust, performant and DOS-proof than anything you'll write in your lifetime.


<citation needed>

The majority of languages web frameworks are written in do not have linked lists, at all.

Python doesn't. Ruby doesn't. Java doesn't.


Basically, anything written in Elixir/Phoenix or Erlang, so: Discord, PagerDuty, Bet365, the entire NHS. If you're a startup in SF, there's a good chance your payments go through Brex. You might be using Slab for your documentation, you might use Boston's MBTA, etc. Probably a not-insignificant chunk of what you do as a startup at some level depends on a framework that uses linked lists. Do you use RabbitMQ or Riak? Linked lists are the primary list datatype for those systems.



> Conduits are better.

That's for streaming I/O. Lazy linked lists are appropriate for streaming without side effects.


Every use of conduit in production that I have witnessed has been a mistake.


Oh? Can you elaborate?


Singly linked lists let you do fast immutable list copies on entry into a new function context.

In Python, for example, let's say:

    def f():
        arr = [0, 1, 2]
        g(arr)
        return arr[0]
What does f return? You don't know. It depends on g. If g has other dependencies, and someone changes a deep dependency (or maybe a library changes an implementation), you could completely fuck up the result of f, spooky action at a distance, and all of your code could be hosed.

Languages that default to using linked lists do not have this problem.
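
For contrast, a sketch of the same shape with immutable cons lists, here in Haskell (f and g mirror the Python above):

    -- Whatever g builds from arr, it cannot change what f observes.
    f :: Int
    f = let arr = [0, 1, 2]
            _ys = g arr            -- g gets arr, but arr is immutable
        in head arr                -- always 0, no matter what g does

    -- g can only make a *new* list that shares arr as its tail.
    g :: [Int] -> [Int]
    g xs = 99 : xs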


There are better alternatives to linked lists for this use case.

1. Clojure uses wide balanced trees as its general-purpose ordered collection.[0] These have probably already been adapted for Haskell, and they offer persistence and fast random access and some degree of cache-friendliness. In my opinion it makes better trade-offs as a general-purpose structure.

2. By passing an &[usize] in Rust, or a const size_t* in C (i.e. by making g accept a read-only pointer), you can be guaranteed that g doesn't modify the array.

[0]: https://hypirion.com/musings/understanding-persistent-vector...


That's not a linked list problem; it's a "passing a mutable object around" problem.

There are immutable non-linked-list data structures (the most obvious one in Python is the tuple).


They are, but it's shallow:

    >>> x = ([1,2,3,4],"whatever")
    >>> x[0].append('z')
    >>> x
    ([1, 2, 3, 4, 'z'], 'whatever')
And Python gives you access to anything:

    >>> def gotcha():
    ...      z = globals()['x']
    ...      z = list(z)
    ...      z.append("I've seen'em do it man!")
    ...      globals()['x'] = tuple(z)
    ... 
    >>> x = (1,2,3,4)
    >>> gotcha()
    >>> x
    (1, 2, 3, 4, "I've seen'em do it man!")


It'd be shallow with a singly linked list too, no? What inherently makes a singly linked list better at addressing this problem if it contains mutable elements?


Typically, in a functional language, values are immutable. So if you cons to a list, you create a new value that has the current list as its tail. Since all lists are immutable, they can be shared and there's no need for a deep copy.

Python is not a functional programming language....
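
For example, in Haskell:

    -- Consing allocates one new cell and shares the old list as its
    -- tail; nothing is copied, and xs is unchanged for everyone else.
    xs, ys :: [Int]
    xs = [2, 3]
    ys = 1 : xs   -- [1,2,3]; xs is still [2,3]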


Sure, but IIRC the performance characteristics of manipulating Python tuples are ultimately way worse if you need to change the contents of the list by, say, prepending, or trimming, or whatever. This is no fun in Python:

    def f(some_tuple):
        part_one = some_tuple[-3:]        # copies the slice into a new tuple
        part_two = part_one + part_one    # copies everything again to "append"
        part_three = part_two[1:]         # copies yet again just to drop the head
etc.


I'm not following: why do singly linked lists let you do fast immutable list copies?

Are you saying that before the call to "g" you would make a deep copy of "arr", and that's faster to do with a singly linked list? Could you explain why?


"Copying" an immutable singly-linked list is just copying the reference to the first element. That is, the entire list is really shared, not actually copied element by element like an array would have to be. But since it is immutable, this doesn't matter: You cannot change it in a way that would mess up other users' view of the list.

What you can still do is cons new elements to the front of your copy. No other copy of the list will be affected. Because the list is singly linked, other owners of references into the list cannot walk back to observe your elements. They can even cons their new elements to the "same" list without any conflict.

All this is very useful if you want to process something with a stack, such that sub-operations can push elements to the stack that they have to remove again before returning. Pushing x and y to the stack is just constructing the list y : x : stack. Popping y and x from the stack when you're done is a no-op: "stack" is still a reference to the stack as it was before these operations.
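
A small sketch of that stack pattern in Haskell (names are mine):

    type Stack = [Int]

    -- Pushing is consing onto a shared, immutable spine.
    subOperation :: Stack -> Int
    subOperation stack = sum (2 : 1 : stack)   -- works on y : x : stack

    -- "Popping" on return is a no-op: the caller's reference still
    -- points at the stack exactly as it was before the pushes.
    caller :: Stack -> (Int, Stack)
    caller stack = (subOperation stack, stack)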


Hum... Okay, rereading it all, I see the parent meant an immutable use of a singly linked list? I hadn't clued in on that the first time around.

I'd still have to say that singly linked lists aren't ideal for that either; better to use a persistent hash-array-mapped-trie-backed list instead.


> better to use a persistent hash-array-mapped-trie-backed list instead

There's a real memory overhead for that structure.


If it's immutable there's no need to copy.


That's true of any data structure, though.



