From Fedora Project Wiki

Name resolution

Resolving using getaddrinfo() in applications

The getaddrinfo() function is a dualstack-friendly API to name resolution. It is used by applications to translate host and service names to a linked list of struct addrinfo objects.

Running getaddrinfo()

And example of getaddrinfo() call:

const char *node = "www.fedoraproject.org";
const char *service = "http";
struct addrinfo hints = {
    .ai_family = AF_UNSPEC,
    .ai_socktype = SOCK_DGRAM,
    .ai_flags = 0,
    .ai_protocol = 0,
    .ai_canonname = NULL,
    .ai_addr = NULL,
    .ai_next = NULL
};
struct addrinfo *result;
int error;

error = getaddrinfo(node, service, &hints, &result);

The input of getaddrinfo() consists of node specification, service specification and further hints.

  • node: literal IPv4 or IPv6 address, or a hostname to be resolved
  • service: numeric port number or a symbolic service name
  • hints.ai_family: enable dualprotocol, IPv4-only or IPv6-only queries
  • hints.ai_socktype: select socket type (and thus protocol family)

getaddrinfo() can be futher tweaked with the hints.ai_flags. Other attributes are either not needed (ai_protocol) or not supposed to be set in hints (ai_canonname, ai_addr and ai_next).

On success, the error variable is assigned to 0 and result is pointed to a linked list of one or more struct addrinfo objects.

Never assume that getaddrinfo() returns only one result or that the first result actually works!

Using getaddrinfo() results

It is necesary to try all results until one successfully connects. This works perfectly for TCP connections as they can fail gracefully at this stage.

struct addrinfo *item;
int sock;

for (item = result; item; item = item->ai_next) {
    sock = socket(item->ai_family, item->ai_socktype, item->ai_protocol);

    if (sock == -1)
        continue;

    if (connect(sock, item->ai_addr, item->ai_addrlen) != -1) {
        fprintf(stderr, "Connected successfully.");
        break;
    }

    close(sock);
}

For UDP, connect() succeeds without contacting the other side (if you are using connect() with udp at all). Therefore you might want to perform additional actions (such as sending a message and recieving a reply) before crying out „success!“.

Freeing getaddrinfo() results

When we're done with the results, we'll free the linked list.

freeaddrinfo(result);

Using getaddrinfo() in Python

Python's socket.getaddrinfo() API tries to be a little bit more sane than the C API.

#!/usr/bin/python3

import sys, socket

host = "www.fedoraproject.org"
service = "http"
family = socket.AF_UNSPEC
socktype = socket.SOCK_DGRAM
protocol = 0
flags = 0

result = socket.getaddrinfo(host, service, family, socktype, protocol, flags)

sock = None
for family, socktype, protocol, canonname, sockaddr in result:
    try:
        sock = socket.socket(family, socktype, protocol)
    except socket.error:
        continue
    try:
        sock.connect(sockaddr)
        print("Successfully connected to: {}".format(sockaddr))
    except socket.error:
        sock.close()
        sock = None
        continue
    break

if sock is None:
    print("Failed to connect.", file=sys.stderr)
    sys.exit(1)

Tweaking getaddrinfo() flags

  • AI_NUMERICHOST: use literal address, don't perform host resolution
  • AI_PASSIVE: return socket addresses suitable for bind() instead of connect(), sendto() and sendmsg()
  • AI_NUMERICSERV: use numeric service, don't perform service resolution
  • AI_CANONNAME: save canonical name to the first result
  • AI_ADDRCONFIG: this never really worked, as far as I know
  • AI_V4MAPPED+AI_ALL: only with AF_INET6, return IPv4 addresses mapped into IPv6 space
  • AI_V4MAPPED: I don't see any real use for this, only returns mapped IPv4 if there are no IPv6 addresses

Flag AI_ADDRCONFIG considered harmful

As far as I know, AI_ADDRCONFIG was added for the following reasons:

  • Some buggy DNS servers would be confused by AAAA requests
  • Optimization of the number DNS queries

Currently, I'm aware of several documents that define AI_ADDRCONFIG:

  • POSIX1-2008: useless but harmless
  • RFC 3493 (informational): useless but (partially) breaks IPv4/IPv6 localhost
  • RFC 2553 (obsolete informational): useless but hopefully harmless
  • GLIBC getaddrinfo(3): like RFC 3493

Actual GLIBC getaddrinfo() behavior differs from the manual page.

Problem statement

Currently, any of the definitions above makes AI_ADDRCONFIG a noop when a link-local IPv6 address is present. These addresses are automatically added to interfaces that are otherwise used to connect to IPv4. Therefore, on a typical linux system, AI_ADDRCONFIG never meets its goals.

But it builds on a false assumption, that no IPv4 communication is feasible without a non-loopback address. But why would we have a loopback address if we can't use it for node-local communication? AI_ADDRCONFIG breaks localhost, localhost4, localhost6, 127.0.0.1, ::1 and more if there's no non-loopback address of the respective protocol.

This can happen if the computer is connected to an IPv4-only network or and IPv6-only network, when it loses IPv4 or IPv6 connectivity and when it's used offline.

What next

A possible solution for the first problem (that AI_ADDRCONFIG is useless) is to treat link-local addresses the same as loopback (or node-local) addresses. But this is even more harmful.

Fedora's GLIBC was patched to do exactly the above thing. The consequence was that even link-local IPv6 stopped working when a global IPv6 address was absent.


leaves out IPv6 link-local addresses which are automatically added to active interfaces. Therefore the whole AI_ADDRCONFIG is not in effect in the cases it was made for. A patch was used in Fedora, that also disregarded

The whole idea of filtering-out non-DNS addresses is flawed and breaks even IPv4 and IPv6 literals.

Filtering of non-DNS addresses in getaddrinfo() has no real use and it only causes problems. There's no reason to filter over the mere existence of addresses. Filtering over global address existence may only be desirable for global address resolution, which is DNS. But that should be done by the DNS resolver that only asks for addresses that make sense and only accepts addresses that it asks for.