Using std::unique_ptr (RAII) with malloc() and free()


This is a short post about using std::unique_ptr with malloc() and free(). The same technique also works with other resource management functions (files, sockets, etc.).

Often, when working with legacy C and C++ code, we see usages of malloc() and free(). Just like new and delete, explicit memory management should be hidden in the guts of libraries whenever possible and never be exposed to the casual programmer.

It would be great if the legacy code could easily be changed to use std::make_unique(), making it nice, clean and safe at the same time. Unfortunately, std::make_unique() allocates with new, and the std::unique_ptr it returns releases with delete. So, if our legacy code uses custom functions to allocate and deallocate memory, then we may be forced to do more refactoring than we have time for (e.g. to use new expressions instead of custom or malloc-based allocators).

Luckily, we can still get the benefit of RAII with std::unique_ptr, at the cost of some cleanliness.
But std::unique_ptr takes a deleter type. No problem, we have decltype() to the rescue!

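#include <stdlib.h> // malloc() and free() in the global namespace
#include <memory>

int main()
{
    // Data owns the buffer; free() is called automatically at scope exit.
    auto Data =
        std::unique_ptr<double, decltype(free)*>{
            static_cast<double*>(malloc(sizeof(double)*50)),
            free };
    return 0;
}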

The decltype(free) gives back a function type of “void (void*)” but we need a pointer to a function. So, we say “decltype(free)*” which gives us “void (*)(void*)”. Excellent!

A bit awkward, but it is still nice, since it does RAII (automatic free) and both the allocator (malloc()) and the deallocator (free()) are clearly visible to the reader.
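
If the same pattern shows up in several places, the awkward type can be spelled once and reused. The following is only a minimal sketch and not part of the original code: malloc_ptr and malloc_array are hypothetical names, and it assumes a toolchain on which taking the address of std::free is acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical alias: a unique_ptr whose deleter is a pointer to free().
template <typename T>
using malloc_ptr = std::unique_ptr<T, decltype(std::free)*>;

// Hypothetical helper: allocates room for 'count' objects of type T with
// malloc(). The memory is left uninitialized, so this is only meant for
// trivial types such as double. The result is null if malloc() fails.
template <typename T>
malloc_ptr<T> malloc_array(std::size_t count)
{
    return malloc_ptr<T>{
        static_cast<T*>(std::malloc(sizeof(T) * count)),
        std::free };
}
```

A call site then shrinks to something like auto Data = malloc_array<double>(50); and the elements are reached through Data.get(). The allocator is no longer visible at the call site, which is part of the trade-off.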

With decltype() we don’t have to write our own deleter functor like in this example:

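#include <stdlib.h> // malloc() and free() in the global namespace
#include <memory>

// Hand-written deleter: calls free() on the owned pointer.
struct MyDeleter
{
    void operator()(double *p) { free(p); }
};

int main()
{
    auto Data =
        std::unique_ptr<double, MyDeleter>{
            static_cast<double*>(malloc(sizeof(double) * 50)) };
    return 0;
}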
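
The functor version is more typing, but it has one practical property that is worth knowing before dismissing it. A function pointer deleter has to be stored inside every std::unique_ptr object, while an empty deleter class usually costs no storage on common standard library implementations. The sketch below illustrates this under that assumption; FreeDeleter, functor_ptr, function_ptr and make_doubles are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical stateless deleter that accepts any malloc()-ed pointer.
struct FreeDeleter
{
    void operator()(void* p) const noexcept { std::free(p); }
};

using functor_ptr  = std::unique_ptr<double, FreeDeleter>;
using function_ptr = std::unique_ptr<double, void (*)(void*)>;

// Allocates 'count' uninitialized doubles owned by a functor-based pointer.
functor_ptr make_doubles(std::size_t count)
{
    return functor_ptr{
        static_cast<double*>(std::malloc(sizeof(double) * count)) };
}
```

Comparing sizeof(functor_ptr) with sizeof(function_ptr) on your own compiler shows whether the difference matters for you. For a quick RAII hack in legacy code it rarely does.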

Also, with decltype() we don’t have to spell out the type of the deleter like in this example:

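#include <stdlib.h> // malloc() and free() in the global namespace
#include <memory>

int main()
{
    // Same as the decltype() version, with the deleter type spelled out.
    auto Data =
        std::unique_ptr<double, void(*)(void*)>{
            static_cast<double*>(malloc(sizeof(double) * 50)),
            free };
    return 0;
}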

So, with std::unique_ptr you can quickly hack RAII into legacy code. However, as a general guideline, prefer refactoring legacy code using modern C++.
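
The same trick carries over to other C-style resources, such as the files mentioned at the top of this post. Here is a minimal sketch for FILE handles; file_ptr and open_file are hypothetical names. The deleter type is spelled out by hand in this sketch, since some compilers emit an attribute-related warning when decltype() is applied to fclose.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Hypothetical alias: owns a FILE* and calls fclose() on destruction.
using file_ptr = std::unique_ptr<std::FILE, int (*)(std::FILE*)>;

// Hypothetical helper: the result is null if fopen() fails. unique_ptr
// never invokes its deleter on a null pointer, so fclose() is only
// called for files that were actually opened.
file_ptr open_file(const char* path, const char* mode)
{
    return file_ptr{ std::fopen(path, mode), std::fclose };
}
```

A caller would write something like auto f = open_file("data.txt", "r"); and check f before passing f.get() to the usual stdio functions.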

END OF POST

Emulating in, out and inout function parameters – Part 2


Emulating in, out and inout function parameters in C++.

This is the continuation of Part 1 of this post.

In Part 1, we created classes to represent Input, Output and Input-Output parameters and arguments.
Here is an example of how those classes can be used:

#include "param_inout.hpp"
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
using namespace param_inout;
using namespace std;
#define PRINT(arg) cout << __FUNCTION__ << ":" << __LINE__ << ": "<< arg << " " << endl

double func1(inp<int> p) {
    // p = 1; // Error: p is read-only.
    return p * 2.2;
}

void func2(outp<int> p) {
    // int a = p; // Error: p is write-only.
    p = 88;
}

void func3(inoutp<string> p) {
    auto t = string(p); // p is readable.
    p = "Hello ";       // p is writable.
    p = string(p) + t;  // p is readable and writable.
}

void func4(inp<string> pattern,
           inp<vector<string>> items,
           inoutp<string> message,
           outp<int> matchCount) {
    PRINT(message.arg());
    auto& ritems = items.arg();
    matchCount = count_if(begin(ritems), end(ritems),
        [&](const string& a) { return a.find(pattern) != string::npos; });
    message = "Done";
}

void func5(inp<int> p1, outp<int> p2, inoutp<string> p3) {
    // func1(p1); // Error! Good! inp::inp(const inp&) is private.
    func1(ina(p1));
    func2(outa(p2));
    func3(inouta(p3));
}

int main() {
    // auto a = func1(ina(2.2)); // Error: Cannot convert inp<double> to inp<int>
    // auto a = func1(2); // Error: inp::inp(const int&) is private.
    auto a0 = func1(ina(static_cast<int>(2.2)));   PRINT(a0);
    auto a1 = func1(ina(2));                       PRINT(a1);
    auto a2 = 0;                func2(outa(a2));   PRINT(a2);
    auto a3 = string{"world!"}; func3(inouta(a3)); PRINT(a3);

    auto a4 = vector<string>{"avocado", "apple", "plum", "apricot", "orange"};
    auto a5 = string{"Searching..."};
    auto a6 = 0;
    func4(ina(string{ "ap" }), ina(a4), inouta(a5), outa(a6));
    PRINT(a5);  PRINT(a6);

    func5(ina(5), outa(a6), inouta(a5));  PRINT(a5); PRINT(a6);

    return 0;
}

This example code produces the following output:

main:47: 4.4
main:48: 4.4
main:49: 88
main:50: Hello world!
func4:30: Searching...
main:56: Done
main:56: 2
main:58: Hello Done
main:58: 88

In the code example above, it is clear at the parameter declaration how each parameter is used by the function. Also, for declaration, we chose the convention to include a trailing “p” after the category. For example, outp signifies an Output Function Parameter. It says that it’s for output and it also says that it’s a parameter (i.e. not an argument).

At the call sites, we pass function arguments using ina (input), outa (output) and inouta (input and output). This way, we can clearly see that they are Function Arguments (not parameters); and how the function is going to use those arguments. No surprises.
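
To see the call-site effect in isolation, here is a stripped-down, self-contained sketch of the output wrapper only. It is not the param_inout.hpp header from Part 1: the outp_sketch namespace and the measure() function are hypothetical, and only the outa(T&) overload is reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace outp_sketch {
    template <typename T> class outp;
    template <typename T> outp<T> outa(T&);

    // Write-only wrapper around a reference to the caller's variable.
    template <typename T>
    class outp {
    public:
        outp(outp&& other) : m_arg(other.m_arg) {}
        outp& operator=(const T& value) { m_arg = value; return *this; }
    private:
        outp(const outp&) = delete;
        explicit outp(T& arg) : m_arg(arg) {}
        friend outp<T> outa<T>(T&);
        T& m_arg;
    };

    // Only outa() can create an outp, so every call site has to spell it.
    template <typename T>
    outp<T> outa(T& arg) { return outp<T>(arg); }
}

// Hypothetical function: reports the length through an output parameter
// and returns whether the text was non-empty.
bool measure(const std::string& text, outp_sketch::outp<std::size_t> length)
{
    length = text.size();
    return !text.empty();
}
```

With this, measure("hello", outa(n)) compiles, while measure("hello", n) does not, so the reader of the call site always sees that n is going to be overwritten.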

It is also obvious how to construct function declarations ourselves. For example, the simple functions at the beginning of Part 1 can be declared like this, regardless of who writes them:

void f(inoutp<string> s);
void g(inp<string> s);

void caller() {
  auto a1 = string{"hello"};
  f(inouta(a1));
  g(ina(a1));
}

And our std::vector example will become this:

void f(inp<vector<string>> v);

It is clear what is happening both at the declaration and at the call site. Also, this convention is much easier and more obvious to follow.

END OF POST


Emulating in, out and inout function parameters – Part 1


Emulating in, out and inout function parameters in C++.

In C++, passing arguments to functions can be done in a variety of ways. If you are not careful, even though your function works as intended, the way its parameters are declared can easily mislead the caller.

Consider the following simple examples:

void f(string *s);
void g(string &s);

void caller() {
  auto a1 = string{"hello"};
  f(&a1);
  g(a1);
}

Is function f going to change s? Or did the author mean “const string *” and simply forget the const qualifier?
The same applies to function g. Also, when calling g, it is not apparent that it may change a1. It looks like it takes a1 by value.

These and similar issues can be avoided by laying down coding conventions about “how to name functions” and “how to declare function parameters” for the team members in your project. The problem with such guidelines is that they may be hard to follow. Even in simple cases they may not be obvious. For example, we may know that a std::vector can be moved by merely copying a few pointers, so passing a temporary by value is cheap. So we automatically declare the input parameter like this:

void f(vector<string> v);

On the other hand, other people may not know this. In this case, they tend to declare the same kind of input parameter like this:

void f(const vector<string>& v);

Both of these are correct, but having both in one code base leads to inconsistency and confusion. Also, guidelines about “how to declare function parameters” can become quite complex, considering when and how to use pointers, references, const qualifiers, pass-by-value, etc. in function declarations.

99% of the time, we can categorize function parameters as

  • Input: Parameters that are only read by the function. They are not changed.
  • Output: Parameters that are only written by the function. The value of the corresponding argument that the caller passes in is not relevant to the function. These parameters are products of the function and the caller sees their new values when the function returns.
  • Input and output: Parameters that are both read and written by the function.

Some languages, like C#, provide standard tools for specifying these categories for function parameters. C++ does not provide standard tools for this. Fortunately, C++ is a very flexible language and we can roll our own tools to achieve this.

Here is one way to implement such a feature:

#pragma once
namespace param_inout {
    // Input
    template <typename T> class inp;
    template <typename T> inp<T> ina(const T&);
    template <typename T> inp<T> ina(const inp<T>&);
    template <typename T>
    class inp {
    public:
        inp(inp&& other) : m_arg{other.m_arg} { /* empty */ }
        operator const T&() const { return m_arg; }
        const T& arg() const { return m_arg; }
    private:
        inp(const inp&) = delete;
        inp(const T& arg) : m_arg{arg} { /* empty */ }
        friend inp<T> ina<T>(const T&);
        friend inp<T> ina<T>(const inp<T>& arg);
        const T& m_arg;
    };
    template <typename T>
    inp<T> ina(const T& arg) { return inp<T>{arg}; }
    template <typename T>
    inp<T> ina(const inp<T>& param) { return inp<T>{param.m_arg}; }

    // Output
    template <typename T> class outp;
    template <typename T> outp<T> outa(T&);
    template <typename T> outp<T> outa(outp<T>&);
    template <typename T>
    class outp {
    public:
        outp(outp&& other) : m_arg{other.m_arg} { /* empty */ }
        outp& operator=(const T& otherArg) { m_arg = otherArg; return *this; }
    private:
        outp(const outp&) = delete;
        outp(T& arg) : m_arg{arg} { /* empty */ }
        friend outp<T> outa<T>(T&);
        friend outp<T> outa<T>(outp<T>&);
        T& m_arg;
    };
    template <typename T>
    outp<T> outa(T& arg) { return outp<T>{arg}; }
    template <typename T>
    outp<T> outa(outp<T>& param) { return outp<T>{param.m_arg}; }

    // Input and output
    template <typename T> class inoutp;
    template <typename T> inoutp<T> inouta(T&);
    template <typename T> inoutp<T> inouta(inoutp<T>&);
    template <typename T>
    class inoutp {
    public:
        inoutp(inoutp&& other) : m_arg{other.m_arg} { /* empty */ }
        operator T&() { return m_arg; }
        T& arg() const { return m_arg; }
        inoutp& operator=(const T& otherArg) { m_arg = otherArg; return *this; }
    private:
        inoutp(const inoutp&) = delete;
        inoutp(T& arg) : m_arg{arg} { /* empty */ }
        friend inoutp<T> inouta<T>(T&);
        friend inoutp<T> inouta<T>(inoutp<T>&);
        T& m_arg;
    };
    template <typename T>
    inoutp<T> inouta(T& arg) { return inoutp<T>{arg}; }
    template <typename T>
    inoutp<T> inouta(inoutp<T>& param) { return inoutp<T>{param.m_arg}; }
}

These simple classes wrap around references to the actual function arguments. Why?

  • For Output and Input-Output parameters, taking a reference is necessary since we want to write into the arguments and make those writes visible to the caller.
  • For Input parameters, we take a const reference. It is a reference because most of the time it is “efficient enough”. It is const because it is a reference and we want it to be read-only. If we want, we can specialize it for simple types, like char or int, to store a copy and not a reference. For the sake of consistency and simplicity, even if you choose to specialize it, it’s probably better to always treat inp<T> as a reference. That is, make a copy of its contents (inp<T>.arg()) if you want to save it somewhere after the function returns.
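
The specialization idea from the last bullet could look roughly like this. It is a self-contained sketch and not the header above: the inp_sketch namespace and the twice() and length() functions are hypothetical, and only the ina(const T&) overload is reproduced.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace inp_sketch {
    template <typename T> class inp;
    template <typename T> inp<T> ina(const T&);

    // Primary template: wraps a const reference, as in the listing above.
    template <typename T>
    class inp {
    public:
        inp(inp&& other) : m_arg(other.m_arg) {}
        operator const T&() const { return m_arg; }
        const T& arg() const { return m_arg; }
    private:
        inp(const inp&) = delete;
        explicit inp(const T& arg) : m_arg(arg) {}
        friend inp<T> ina<T>(const T&);
        const T& m_arg;
    };

    // Hypothetical specialization for int: stores a copy, not a reference.
    template <>
    class inp<int> {
    public:
        inp(inp&& other) : m_arg(other.m_arg) {}
        operator int() const { return m_arg; }
        int arg() const { return m_arg; }
    private:
        inp(const inp&) = delete;
        explicit inp(int arg) : m_arg(arg) {}
        friend inp<int> ina<int>(const int&);
        const int m_arg;
    };

    template <typename T>
    inp<T> ina(const T& arg) { return inp<T>(arg); }
}

// Hypothetical users: the function bodies look the same for both variants.
double twice(inp_sketch::inp<int> p) { return p * 2.0; }
std::size_t length(inp_sketch::inp<std::string> p) { return p.arg().size(); }
```

Callers cannot tell the two variants apart: twice(ina(21)) and length(ina(text)) are written the same way, which is the consistency argument made above.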

In Part 2 of this post, we look at how these classes can be used.
Continue to Part 2.

END OF POST


The A::Restricted idiom


Fine grained access control to private members of a class

Sometimes I wish I could control the access to a class in a finer way.
Usually you have these tools in your arsenal:

  • private
  • protected
  • public
  • friend

Private is fine, since most of the stuff in a class should be hidden from all, in order to maximize encapsulation.

Protected parts can be accessed by anybody who derives from your class. This opens your class too much, i.e. to every deriving class. Making these protected parts private later on can be impossible due to the number of deriving classes or due to not knowing who may derive from you (e.g. if you are a library). This can considerably hinder refactoring efforts.

Public parts have the same problems as the protected parts and more. The whole world can see your public parts.

Making a non-related entity a friend of your class also exposes too much. The friend can access every part of your class even if it does not need to.

The bottom line is, if you open up your class with protected or public then the encapsulation of your class is hurt badly. Also, making non-members friends is almost always unnecessarily generous.

We would need better support in the core language for more specific control over who can access what. Something like this:

// WARNING! This is NOT C++! The "public(...)" is fictional.

class A {
public(class B, class D): // Private for all but public for B and D.
  void f();
private: // Private for all.
  int a;
};

class B : public A {
   void h() { f(); } // OK! A::f() is public for B.
};

class C : public A {
   void h() { f(); } // Error! A::f() is private for C.
};

class D {
   void h(A& a) { a.f(); } // OK! A::f() is public for D.
};

In the above code, A::f() is marked as public only for classes B and D. For everybody else, it is private (default access for class members).
Unfortunately, the public access modifier in C++ does not support this syntax and semantics.

Luckily, there is a workaround to emulate this kind of behavior. Here is one way to do this:

// a.hpp
#pragma once

class A {
private:
    void f();
    virtual void g();
    double d;

public:
    A();

    class Restricted {
    private:
        A& parent;
        Restricted(A& p);
        // Proxy functions
        void f();
        void g();
        // Friends of Restricted
        friend class A;
        friend class B;
        friend class D;
    };
    Restricted restricted;
};
// a.cpp
#include "a.hpp"
#include <iostream>

A::A() : restricted{*this} { }
void A::f() { std::cout << "A::f()" << std::endl; }
void A::g() { std::cout << "A::g()" << std::endl; }

A::Restricted::Restricted(A& p) : parent{p} { }
void A::Restricted::f() { parent.f(); }
void A::Restricted::g() { parent.g(); }

Class A has some private parts, f, g and d. From these parts, we want to expose only the functions and we want to strictly control who can access them.

For this, we define an inner Restricted class. In Restricted, everything is private. We open it up only for selected entities; here A, B and D. We create proxy functions whose task is to forward the call to those parts in A that we want to make accessible to the friends of Restricted. The Restricted class has a reference to the outer A object to which it forwards the calls.

The friends of Restricted can access Restricted::parent, but this is not a problem at all. The parent is a reference, so it cannot be changed to point to some other A after construction. Also, only the public parts of A can be accessed through parent. The encapsulation of A is not weakened.

The friends of Restricted can access Restricted::Restricted(A&) and construct Restricted objects, but this is not a problem either. In the worst case, this can result in multiple Restricted objects referencing the same A object. Again, the encapsulation of A is not weakened because through these Restricted::parent references only the public parts of A can be accessed. Also, through these Restricted objects only the selected private parts of A can be accessed (here A::f() via A::Restricted::f() and A::g() via A::Restricted::g()).

In class A, everything is private and only the very minimum is made public. A::A() is public because we want to allow anybody to create A objects. A::restricted is public because otherwise the friends (e.g. B) specified inside Restricted cannot access it.
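
The pieces described so far can be condensed into a single-file sketch. Gadget plays the role of A and Trusted the role of a friend such as D; both names, and the log() accessor, are hypothetical. The sketch also makes one assumption that the listing above does not: copying is disabled, because an implicitly generated copy would copy the Restricted member as well, and the copied parent reference would still refer to the source object.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class Gadget {
private:
    std::string log_;
    void f() { log_ += "f"; }        // the part we want to restrict

public:
    Gadget() : restricted(*this) {}
    // Assumption of this sketch: no copies, so that restricted.parent
    // can never refer to a different Gadget than the one it lives in.
    Gadget(const Gadget&) = delete;
    Gadget& operator=(const Gadget&) = delete;

    const std::string& log() const { return log_; }

    class Restricted {
    private:
        Gadget& parent;
        explicit Restricted(Gadget& p) : parent(p) {}
        void f() { parent.f(); }     // proxy for Gadget::f()
        friend class Gadget;
        friend class Trusted;        // the only outsider that gets access
    };
    Restricted restricted;
};

class Trusted {
public:
    static void poke(Gadget& g) { g.restricted.f(); }
};
```

Any class other than Trusted that tries g.restricted.f() gets a compile error, exactly as class C does in the examples of this post.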

The examples below show how class A can be used. Access control to A::Restricted is managed strictly by A via the friends of A::Restricted. So, nobody else can gain access without the “permission” of A.

We create a class B deriving from A. B is a friend of A::Restricted. This gives B access to the restricted parts of A.

// b.hpp
#pragma once
#include "a.hpp"
class B : public A {
public:
    void h();
private:
    virtual void g() override;
};
// b.cpp
#include "b.hpp"
#include <iostream>

void B::h() {
    std::cout << "B::h() enter" << std::endl;
    // ++d;         // Error! A::d is private.
    // f();         // Error! A::f() is private.
    g();            // OK! B::g() is accessible here.
    restricted.f(); // OK! A::Restricted::f() is public here.
    restricted.g(); // OK! A::Restricted::g() is public here.
    std::cout << "B::h() exit" << std::endl;
}

void B::g() { std::cout << "B::g()" << std::endl; }

We create a class C deriving from A, but we do not give it access to the restricted parts of A.

// c.hpp
#pragma once
#include "a.hpp"
class C : public A {
public:
    void h();
};
// c.cpp
#include "c.hpp"
void C::h() {
    // f();             // Error! A::f() is private.
    // g();             // Error! A::g() is private.
    // restricted.f();  // Error! A::Restricted::f() is private.
    // restricted.g();  // Error! A::Restricted::g() is private.
}

We create a class D that is not related to A. Yet, it can access the restricted parts of A because we explicitly allow it.

// d.hpp
#pragma once
class D {
public:
    void h(class A&);
};
// d.cpp
#include "d.hpp"
#include "a.hpp"
void D::h(A& a) {
    // a.f();            // Error! A::f() is private.
    // a.g();            // Error! A::g() is private.
    a.restricted.f(); // OK! A::Restricted::f() is public here.
    a.restricted.g(); // OK! A::Restricted::g() is public here.
}
// main.cpp
#include "b.hpp"
#include "c.hpp"
#include "d.hpp"
int main() {
    auto b = B{};    b.h();
    // b.f();            // Error! A::f() is private.
    // b.g();            // Error! B::g() is private.
    // b.restricted.f(); // Error! A::Restricted::f() is private.
    auto c = C{};    c.h();
    auto d = D{};    d.h(b);
    return 0;
}

In this example, only B and D can access A::f() and A::g(), but only through A::Restricted. Nobody else can. A::d remains private for everybody. And this I call the A::Restricted idiom.
Here is the output of this example program:

B::h() enter
B::g()
A::f()
B::g()
B::h() exit
A::f()
B::g()

This idiom is a variation of the Attorney-Client Idiom. I prefer this variant (i.e. A::Restricted) because it provides a more convenient and more intuitive syntax for accessing the restricted parts. This is achieved by the automatic wiring between class A and class Restricted.
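
For comparison, here is a minimal sketch of the Attorney-Client idiom in its usual form. Client, Attorney and Reader are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

class Client {
private:
    int secret() const { return 42; }   // exposed through the attorney
    int internal() const { return 7; }  // stays hidden from everybody else
    friend class Attorney;              // the attorney is the only friend
};

// The attorney forwards only secret() and decides who may call it.
class Attorney {
private:
    static int secret(const Client& c) { return c.secret(); }
    friend class Reader;
};

class Reader {
public:
    static int read(const Client& c) { return Attorney::secret(c); }
};
```

Here the call site has to name a separate class (Attorney::secret(c)), whereas with A::Restricted the access reads as a member of the object itself (a.restricted.f()).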

END OF POST


The Widget::O idiom


A well-encapsulated design makes only the minimum necessary parts of a class public (or protected). In other words, everything should be private by default, and things should be made public (or protected) only if absolutely necessary.
In C++, one tool for this is to implement parts of your class as non-member non-friend functions. These functions should be put in a namespace so that they don’t litter the global namespace. If you have a class Widget, then people usually define a WidgetHelper namespace in which they put its helper functions:

class Widget {
public:
  Widget() : width{}, height{} {}
  void setWidth(int w) { width = w; }
  void setHeight(int h) { height = h; }
private:
  int width;
  int height;
};

namespace WidgetHelper {
  void setSize(Widget& o, int w, int h) {
    o.setWidth(w);
    o.setHeight(h);
  }
}

int main() {
  auto w = Widget{};
  WidgetHelper::setSize(w, 10, 20);
  return 0;
}

A cleaner way is to use something I call the Widget::O idiom. Here is how the above code looks using the Widget::O idiom:

namespace Widget {
  class O {
  public:
    O() : width{}, height{} {}
    void setWidth(int w) { width = w; }
    void setHeight(int h) { height = h; }
  private:
    int width;
    int height;
  };

  void setSize(O& o, int w, int h) {
    o.setWidth(w);
    o.setHeight(h);
  }
}

int main() {
  auto w = Widget::O{};
  setSize(w, 10, 20); // Syntax with ADL.
  Widget::setSize(w, 30, 40); // Syntax without ADL.
  return 0;
}

The Widget::O idiom helps you keep all the functions and data related to Widget::O in the same Widget namespace, regardless of whether they are members or non-members. With this idiom, you can also use ADL (Argument Dependent Lookup), if you prefer, when calling the non-member functions.
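
ADL pays off most in generic code. The sketch below is only an illustration under assumptions: the Panel namespace, the area() members and the resizedArea() template are hypothetical additions. The unqualified setSize() call inside the template is resolved by ADL to the function in the namespace of whichever O is passed in.

```cpp
#include <cassert>

namespace Widget {
    class O {
    public:
        void setWidth(int w) { width = w; }
        void setHeight(int h) { height = h; }
        int area() const { return width * height; }
    private:
        int width = 0;
        int height = 0;
    };

    void setSize(O& o, int w, int h) {
        o.setWidth(w);
        o.setHeight(h);
    }
}

namespace Panel {   // hypothetical second type following the same idiom
    class O {
    public:
        void setSide(int s) { side = s; }
        int area() const { return side * side; }
    private:
        int side = 0;
    };

    // A panel is square in this sketch, so it uses the smaller of w and h.
    void setSize(O& o, int w, int h) { o.setSide(w < h ? w : h); }
}

// Generic code: no namespace is named, ADL picks the matching setSize().
template <typename T>
int resizedArea(T& item, int w, int h) {
    setSize(item, w, h);
    return item.area();
}
```

With a separate WidgetHelper namespace this would not work, because ADL only looks into the namespaces associated with the argument types.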

END OF POST


Emulating final


Emulating the final keyword of C++11 for disallowing inheritance

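Here is a way to emulate the “final” feature of C++11 for marking classes non-inheritable. This works with C++98 compilers too. Strictly speaking, the “friend T;” declaration it relies on was standardized only in C++11, so a pre-C++11 compiler accepts it only as an extension.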

// Definition:
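namespace xxx_do_not_use
{
  template<typename T>
  class make_final {
  private:
    make_final() {}  // Only friends of make_final can construct it.
    friend T;        // T (the class being made final) is the only friend.
  };
}
#define FINAL(T) virtual xxx_do_not_use::make_final<T>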

// Usage example:
class A {};
class B : public A, FINAL(B) {};
class C : public B {};
int main()
{
    B b;
    C c; // Error! C::C() cannot call the private constructor of make_final<B>.
    return 0;
}

In the above code, with the FINAL macro, class B privately inherits from class make_final. This means that class B is “implemented in terms of” class make_final but “it is not” a class make_final.
Class B inherits also virtually from make_final. This means that if some other class (e.g. class C) tries to inherit (directly or indirectly) from class B, then this inheriting class must call a constructor of make_final (due to virtual inheritance).
So, if we make the constructors of make_final private, then class C cannot call them and we get a compilation error. This makes inheritance from B impossible, thus B is “final”.
Also, we still want B to be usable on its own, so we mark it as a friend of make_final. This way B can call the private constructor of make_final, but nobody else can.
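
For comparison, the sketch below puts the emulation next to the built-in final specifier. It assumes a C++14 compiler (for std::is_final); the final_sketch namespace and the class names are hypothetical. Note that with the emulation, deriving a class is still accepted by the compiler and only constructing the derived class fails. On a C++11 or later compiler the derived class simply ends up without a usable default constructor.

```cpp
#include <cassert>
#include <type_traits>

namespace final_sketch {
    template <typename T>
    class make_final {
    private:
        make_final() {}
        friend T;   // only T can call the private constructor
    };

    class A {};

    // Emulated: the virtual base can only be constructed by EmulatedFinal.
    class EmulatedFinal : public A, virtual make_final<EmulatedFinal> {};

    // Declaring a derived class compiles, but its implicit default
    // constructor is deleted, so no object of it can be created.
    class Derived : public EmulatedFinal {};

    // Built-in: any attempt to derive from NativeFinal is rejected outright.
    class NativeFinal final : public A {};
}
```

If C++11 is available, the built-in specifier is the simpler choice. It gives an error right at the declaration of the derived class and needs neither the macro nor virtual inheritance.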

The above example code gives the following output (GNU GCC version 4.8.1):

$ g++ main.cpp -o demo -lm -pthread -lgmpxx -lgmp -lreadline 2>&1
main.cpp: In constructor 'C::C()':
main.cpp:6:5: error: 'xxx_do_not_use::make_final<T>::make_final() [with T = B]' is private
     make_final() {}
     ^
main.cpp:14:7: error: within this context
 class C : public B {};
       ^
main.cpp: In function 'int main()':
main.cpp:18:7: note: synthesized method 'C::C()' first required here 
     C c;
       ^

END OF POST