metadat 3 days ago

How can the C++ statement below even compile? Since "1" and "2" are strings (not even chars).

  std::vector<int> v = {"1", "2"}; // UB
Rather than undefined behavior I'd expect a type violation error. But my C++ has gotten a bit rusty.
  • GrantMoyer 3 days ago

    It's simple: string literals in C++ are arrays of const char that decay to pointers to characters. std::vector has a constructor which takes a begin and end iterator and creates a vector with contents copied from that range. Pointers are iterators in C++, and chars are implicitly convertible to ints, so overload resolution selects the aforementioned constructor. The pointer to "1" and the pointer to "2" don't point into the same object, so trying to iterate from one to the other eventually causes a dereference past the end of "1", which is undefined behavior.
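
To make the overload-resolution step above concrete, here is a minimal sketch of the same two-pointer initialization in its well-defined form, where both pointers refer to one array. The function name `from_same_array` is made up for illustration and does not come from the thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: builds a vector<int> from two pointers into ONE array.
std::vector<int> from_same_array() {
    static const char digits[] = "12";  // array of 3 chars: '1', '2', '\0'
    const char* first = digits;
    const char* last = digits + 2;      // one past '2', still inside the array
    // A const char* cannot convert to int, so the initializer_list<int>
    // constructor is not viable and the (first, last) constructor is chosen.
    // Each char is converted to int while the range is copied.
    std::vector<int> v = {first, last};
    assert(v.size() == 2);
    return v;
}
```

With `{"1", "2"}` the same constructor is picked, but the two pointers come from unrelated arrays, so walking from the first to the second is where the undefined behavior comes from.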

    • metadat 3 days ago

      I don't know about simple, haha, but thank you so much for explaining in depth!

    • jiggawatts 2 days ago

      This... this is why I gave up C++ programming and have not looked back.

      • lionkor 2 days ago

        Good! More UB for the rest of us!

        • WJW 2 days ago

          Now now gentlemen, play nice. There is enough UB for all of us :)

          • ralferoo a day ago

            All UBs are belong to us!

            • saagarjha 12 hours ago

              Nah, C holds claim to some of it

  • mwkaufma 3 days ago

    It's matching std::vector's implicit constructor which takes two iterators, which can be initialized with any pointer. Note that it won't compile with one, three, or more string-literal char*s, but exactly two.

    • metadat 3 days ago

      Why only exactly two?

      The whole "UB but it still compiles" thing is pretty gross.

      • comex 3 days ago

        Because the constructor takes two arguments, a "begin" iterator and an "end" iterator. And pointers count as valid iterators.

        Normally you wouldn't use the "= {}" syntax to invoke a constructor this way. Instead of `std::vector<int> v = {begin, end}` you would usually write `std::vector<int> v(begin, end)`. But for some reason C++11 decided to make those two things mostly equivalent.
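
A small sketch of the "mostly equivalent" point, assuming nothing beyond the standard `std::vector` constructors. The function names are hypothetical. The first function shows a case where braces and parentheses pick the same constructor, the second a case where they do not.

```cpp
#include <cassert>
#include <vector>

// Braces and parentheses agree when no initializer_list constructor is viable:
// a const int* cannot convert to int, so both forms use (first, last).
bool same_for_iterators() {
    const int src[] = {10, 20, 30};
    std::vector<int> a(src, src + 3);     // iterator-pair constructor
    std::vector<int> b = {src, src + 3};  // same constructor, reached via braces
    return a == b;
}

// They differ when the braced elements convert to the value type:
// braces then prefer the initializer_list constructor.
bool differ_for_ints() {
    std::vector<int> a(3, 4);  // three elements, each equal to 4
    std::vector<int> b{3, 4};  // two elements: 3 and 4
    return a.size() == 3 && b.size() == 2;
}

void check_braces_vs_parens() {
    assert(same_for_iterators());
    assert(differ_for_ints());
}
```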

        • gpderetta 2 days ago

          List initialization is quite nice, but the interaction with brace initialization introduced at the same time is a giant foot-gun.

        • BobbyJo 2 days ago

          > But for some reason C++XX decided to....

          Why I gave up C++.

          That and because somehow package management is still a nightmare.

      • UncleMeat 2 days ago

        This sort of UB detection requires bespoke work.

        The compiler probably can figure out that the begin and end iterators here are referencing different objects but if you add just a bit more complexity to the code then the compiler won't be able to prove that.

        • mwkaufma 2 days ago

          That's consistent with STL's move to use "explicit" and other narrowing annotations, as described in the article.

  • ramon156 3 days ago

    I'd argue that C++ is not type safe. Yes, strings could be converted to ints in theory, but these types are obviously not the same, so why not make it more explicit?

    • maleldil 2 days ago

      Type safety isn't binary but a spectrum. It's obvious that C++ is more type-safe than JS but less than Rust, which is less than Haskell.

      • eru 2 days ago

        Well, at some points it becomes even more complicated than just a spectrum.

        Your types in different languages just track different things.

        Eg Haskell's types (normally) don't track lifetimes nor ownership, but Rust does that. In contrast, Haskell likes to track whether side-effects like IO can occur at all, while Rust is happy to just let you eg open a file almost anywhere.

        • chuckadams 2 days ago

          Haskell has linear types now, so there's your lifetime and ownership, though thankfully they're optional... wrangling linear types makes Rust look friendly and lenient.

      • JonChesterfield 2 days ago

        C++ is obviously hilariously less type safe than JavaScript.

        JavaScript will convert values from one type to another at runtime but it's always absolutely sure what the type of the value is.

        C++ will compile code that looks reasonable, decide it's UB, not tell you about that and proceed to do abject nonsense at runtime.

        Considering C++ the more "type safe" one of the two is so far from accurate that I wonder if you've mistyped the name of one of the languages.

        • eru 2 days ago

          I have a lot of sympathy for your point of view. Though have a look at Rust: it's generally considered much more typesafe than both JavaScript and C++, but it also has (some) dark corners of undefined behaviour with approximately all the same baggage as C++.

          See also https://news.ycombinator.com/item?id=8206562 for a different point of view: dynamically typed languages are equivalent to statically typed languages with just a single static type.

    • jandrewrogers 2 days ago

      You can make it both stricter and more explicit. This is a 30 year old API from when C++ was a very different language. In a modern context, this example was explicitly designed to be as loosely checked as it is, it isn't intrinsic.

      Backward compatibility means they can't change this API to not allow these cases even if it is straightforward to do so.

    • mofeien 2 days ago

      I agree. C++ is statically typed, as in everything has a type known at compile time. But it is also weakly typed, as in types can be converted into others implicitly, which only makes sense a fraction of the time and readily introduces accidental UB at other times.
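
As a rough illustration of the implicit conversions being discussed, the sketch below uses three built-in conversions that compile without a cast. The function names are hypothetical. Each of these particular conversions is well defined but lossy or surprising; none of them is undefined behavior on its own.

```cpp
#include <cassert>
#include <limits>

int truncated() {
    double d = 3.9;
    int i = d;  // double -> int: the fractional part is silently dropped
    return i;
}

unsigned wrapped() {
    int n = -1;
    unsigned u = n;  // int -> unsigned: the value wraps modulo 2^N
    return u;
}

int promoted() {
    char c = 'A';
    int code = c;  // char -> int: the conversion the vector example relies on
    return code;
}

void check_conversions() {
    assert(truncated() == 3);
    assert(wrapped() == std::numeric_limits<unsigned>::max());
    assert(promoted() == 'A');
}
```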

  • ycombinatrix 2 days ago

    >my C++ has gotten a bit rusty

    i see what you did there

jimbob45 3 days ago

Was there ever a reason given as to why “explicit” was chosen over a hypothetical “implicit”?

  • pjmlp 2 days ago

    Many of the original C++ decisions come back to how it was supposed to be Typescript for C, which was a reason why it became widely used, and why some warts are the way they are.

    Like having C structs magically turn into C++ ones, thus implicit rules like these.

    Anyone that cares about C++ evolution should read "Design and Evolution of C++", not only for how it came to be but also for the safety approaches over plain C that Bjarne is still arguing for to this day at WG21 meetings.

    • senkora 2 days ago

      I don't understand. When does a C89 struct display implicit conversion behavior that would justify making C++ class constructors implicit by-default?

      For example, the following code does not compile with either -std=c89 or -std=c++98, but does compile as C++ if we uncomment the constructor line:

          struct Foo {
            int x;
      
            /*public: Foo(int x) : x(x) {}*/
          };
      
          int main() {
            struct Foo foo = 5;
            return 0;
          }
      
      
          $ gcc -std=c89 tmp.c
          tmp.c: In function ‘main’:
          tmp.c:8:20: error: invalid initializer
              8 |   struct Foo foo = 5;
                |                    ^
          $ g++ -std=c++98 tmp.c
          tmp.c: In function ‘int main()’:
          tmp.c:8:20: error: conversion from ‘int’ to non-scalar type ‘Foo’ requested
              8 |   struct Foo foo = 5;
      
      Maybe I'm missing something?
      • pjmlp 2 days ago

        You are missing that such struct initialization did not exist when C with Classes came to be.

        Additionally, you are missing the whole package of struct semantics in C++: while they should still look like C structs to the naked eye, they also have to support C++ struct semantics of memory construction, copy assignment and bitwise comparison.

        Hence structs and classes are the same, with the difference that structs are public by default, with code generated to keep the bitwise C semantics until any of those operations are redefined, at which point the compiler leaves the job to the developer.
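
One way to read the point about structs and classes, sketched with hypothetical types (`PlainPoint`, `ClassPoint`, `Tracked`) that are not from the thread: a plain struct keeps C-like bitwise copying, and that stops once the copy constructor is user-defined.

```cpp
#include <cassert>
#include <type_traits>

struct PlainPoint { int x; int y; };         // members are public by default
class ClassPoint { public: int x; int y; };  // members are private by default, hence `public:`

struct Tracked {
    int x;
    Tracked(int v) : x(v) {}
    Tracked(const Tracked& other) : x(other.x + 1) {}  // user-defined, no longer a bitwise copy
};

// struct and class both declare class types; only the default access differs.
static_assert(std::is_class<PlainPoint>::value && std::is_class<ClassPoint>::value,
              "both are class types");
static_assert(std::is_trivially_copyable<PlainPoint>::value,
              "compiler-generated copy is a plain memberwise copy");
static_assert(!std::is_trivially_copyable<Tracked>::value,
              "a user-defined copy constructor takes over");

int copied_x() {
    Tracked a(1);
    Tracked b = a;  // calls the user-defined copy constructor
    assert(b.x == 2);
    return b.x;
}
```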

        • senkora 2 days ago

          Thank you for the response. I still don’t understand why it would be necessary for C++ to default to non-explicit constructors when C at the time did not have constructors and did not have constructor-like struct initialization that mimicked non-explicit constructors.

          It seems like an unforced error in the language design, rather than a concession to backwards-compatibility.

  • RcouF1uZ4gsC 3 days ago

    In C++, because of how it developed, the defaults are not optimal. For example, constructors and conversions are implicit and you have to make them explicit. Variables are mutable by default and you have to make them const. Local primitive variables are uninitialized by default.
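
A minimal sketch of the first default mentioned above, using two hypothetical wrapper types (`Meters` and `Seconds` are invented for illustration). Only the constructor default is shown; constness and initialization are left out.

```cpp
#include <cassert>
#include <type_traits>

struct Meters {
    Meters(double v) : value(v) {}  // implicit by default: a double converts silently
    double value;
};

struct Seconds {
    explicit Seconds(double v) : value(v) {}  // conversion must be spelled out
    double value;
};

double length_of(Meters m) { return m.value; }
double duration_of(Seconds s) { return s.value; }

// is_convertible tests implicit conversion; is_constructible tests direct initialization.
static_assert(std::is_convertible<double, Meters>::value, "implicit conversion allowed");
static_assert(!std::is_convertible<double, Seconds>::value, "explicit blocks implicit conversion");
static_assert(std::is_constructible<Seconds, double>::value, "direct initialization still works");

void check_explicit() {
    assert(length_of(2.5) == 2.5);             // double -> Meters happens implicitly
    assert(duration_of(Seconds(1.5)) == 1.5);  // duration_of(1.5) would not compile
}
```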

    • JTyQZSnP3cQGa8B 3 days ago

      > Variables are default mutable and you have to make them const

      Except for the captured variables in a lambda which are const unless you use the mutable keyword. Not a bad idea though.

      • Maxatar 3 days ago

        And that's only true if you capture by value. If you capture by reference they remain mutable.

        • chombier 3 days ago

          I think this is because `mutable` qualifies the call operator of the lambda (like a reverse const qualifier) so by-value captures are effectively const during the call unless the lambda is marked `mutable`. References themselves are always const, but the referenced object may be modified through the reference depending on its constness even though the lambda is not `mutable`.

          Is there a way to force capture by const-reference by the way?
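
The capture rules described above can be sketched as follows. The function name `capture_demo` is hypothetical.

```cpp
#include <cassert>

// Returns the value of the outer variable after both lambdas have run.
int capture_demo() {
    int x = 0;

    // By-value capture: the copy is const inside operator() unless the
    // lambda is marked `mutable`, which makes the call operator non-const.
    auto by_value = [x]() mutable { x = 1; return x; };
    int inner = by_value();
    assert(inner == 1);  // the lambda's own copy changed
    assert(x == 0);      // the outer variable did not

    // By-reference capture: no `mutable` needed to modify the referenced object.
    auto by_ref = [&x] { x = 2; };
    by_ref();
    return x;
}
```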

          • gpderetta 2 days ago

              #include <utility> // std::as_const, C++17

              int main() {
                int x = 0;
                [&x] { x = 1; }(); // works
                [&x = std::as_const(x)] { x = 1; }(); // error: assignment of read-only reference 'x'
              }
            
            Not very pretty, but it works.
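
Since the snippet above is meant to fail to compile, here is a compilable sketch of the same init-capture that checks constness through overload resolution instead. The helper `binds_const` and the two function names are hypothetical, and C++17 is assumed for `std::as_const`.

```cpp
#include <cassert>
#include <utility>  // std::as_const (C++17)

// Overload pair that reports which kind of reference an lvalue binds to.
bool binds_const(int&) { return false; }
bool binds_const(const int&) { return true; }

bool plain_capture_is_const() {
    int x = 0;
    auto f = [&x] { return binds_const(x); };  // x is a mutable int lvalue here
    return f();
}

bool as_const_capture_is_const() {
    int x = 0;
    // The init-capture declares a new reference named x, bound to a const view.
    auto f = [&x = std::as_const(x)] { return binds_const(x); };
    return f();
}

void check_captures() {
    assert(!plain_capture_is_const());
    assert(as_const_capture_is_const());
}
```
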
  • cjensen 3 days ago

    I don't know, but a hypothetical 'implicit' would mean converting all unmarked constructors into explicit constructors... which would have broken a lot of existing code.

  • lenkite 2 days ago

    I guess if Stroustrup had made "explicit" constructors the default, the C-graybeards would never have adopted C++. Far too much work in porting.

geerlingguy 3 days ago

[flagged]

  • seanhunter 3 days ago

    Even if you think STL means St Louis, you can't make that sentence confusing because none of those other potential uses work in this place.

    eg "The St Louis" is only a possible construction if you put another noun after it like "How the STL cardinals handle the pressure of big games" (I don't know, I don't really do sports but you get the idea that's a valid sentence).

    You can't have a sentence "How the STL uses explicit" be at all confusing with any of the examples you gave.

    • saghm 3 days ago

      At least grammatically "the St. Louis" could make sense if you read it as referring to the namesake of the city. "Who was that guy using explicit constructors, King Louis XIV?" "No, it was the _Saint_ Louis!"

      I assume that no one named Louis has been canonized since C++ was invented though, and even if it was, using "the" might not be enough to disambiguate in some contexts.

      • tialaramex 2 days ago

        I feel like you're obliged to omit "Louis" here? Certainly I would feel I can't add this superfluous word; in some cases such words are optional, but I don't think this one is here.

        I can say "No, it was the Saint" and "No, it was Saint Louis" but I don't think I'd utter "No, it was the Saint Louis" except as a speech error, maybe prompted by rushing like a Colemanball (commonly sports commentators mix metaphors or change their minds about intent midway through an utterance as of course they're live and events are unfolding as they speak, e.g. "That was a decisive mistake although nothing is decided yet").

        • saghm 2 days ago

          > I can say "No, it was the Saint" and "No, it was Saint Louis" but I don't think I'd utter "No, it was the Saint Louis" except as a speech error

          I don't think you're wrong per se, but this seems more like a personal style of speaking rather than a matter of correctness; some people might prefer conciseness, whereas others might speak more formally and avoid any implicitness. I think it also can feel quite a bit different depending on how the words are emphasized; I italicized "the" in my previous comment since it was the word being questioned, but it doesn't feel super jarring to me to add "the" when "Saint" is emphasized in your quote above. I can totally imagine myself or someone else being confused for a bit and after in the relief of finally understanding dragging out the words a bit like "ohhhhh, it was the _Saint_ Louis, I thought you meant the _other_ one".

      • seanhunter 3 days ago

        ...but do we know how the Saint Louis uses explicit? Catholic would-be C++ experts are dying to know.

        • saghm 2 days ago

          There's probably no way of knowing, unfortunately; when the meek inherit the earth, they don't have access to the constructors of the parent class.

    • geerlingguy 2 days ago

      Most people from St. Louis are used to hearing the phrasings "the STL", or "the Lou" when referring to the city.