Econstudentlog

Code Complete (I)

Here’s what I wrote about the book elsewhere not long ago: “I recently started reading the book Code Complete by Steve McConnell. I’ve only read the first 100 pages so far (Kindle estimate of time remaining: 27 hours… – then again, it is 960 pages…) but I can already confidently say at this point that if you’re a software developer or programmer or similar, or plan to be, then you’ll want to read this book – it’s awesome. (…and even just reading the first 50 pages of this book would probably make you a better programmer, even if you read no more than that…)”

I enjoy reading it, and on many of the pages I learn something new or get a new angle on a topic I’m already familiar with – I’ve already come across multiple important insights that I really wish I’d known about when involved in projects in the past, and I’m trying to share some of these learnings with my coworkers as well. It’s just a good book. The fact that it’s already almost 20 years old of course means that there isn’t a great deal of coverage of, say, the topics touched upon in the lecture below this post, but a lot of this stuff is about fundamentals, concepts, and tradeoffs, so the book’s age probably matters significantly less than you’d think. Not all of the suggestions made are ones I feel tempted to immediately follow/implement in my daily work, but most of them at the very least make you think a bit more about the choices you might be making, often subconsciously – and as the quotes below incidentally illustrate, it’s not just a book about coding.

I have added some sample quotes from the chapters I’ve read so far below.

“Construction is a large part of software development. Depending on the size of the project, construction typically takes 30 to 80 percent of the total time spent on a project. […] Construction is the central activity in software development. Requirements and architecture are done before construction so that you can do construction effectively. System testing (in the strict sense of independent testing) is done after construction to verify that construction has been done correctly. […] With a focus on construction, the individual programmer’s productivity can improve enormously. A classic study by Sackman, Erikson, and Grant showed that the productivity of individual programmers varied by a factor of 10 to 20 during construction (1968). Since their study, their results have been confirmed by numerous other studies (Curtis 1981, Mills 1983, Curtis et al. 1986, Card 1987, Valett and McGarry 1989, DeMarco and Lister 1999, Boehm et al. 2000). This book helps all programmers learn techniques that are already used by the best programmers.”

“As much as 90 percent of the development effort on a typical software system comes after its initial release, with two-thirds being typical (Pigoski, 1997).”
“It generally doesn’t make sense to code things you can buy ready-made.”
“Choosing the right tool for each problem is one key to being an effective programmer.”
“Good architecture makes construction easy. Bad architecture makes construction almost impossible.”
“Good software architecture is largely machine- and language-independent.”
“Part of a programmer’s job is to educate bosses and coworkers about the software-development process, including the importance of adequate preparation before programming begins.”

“Both building construction and software construction benefit from appropriate levels of planning. If you build software in the wrong order, it’s hard to code, hard to test, and hard to debug. It can take longer to complete, or the project can fall apart because everyone’s work is too complex and therefore too confusing when it’s all combined. Careful planning doesn’t necessarily mean exhaustive planning or over-planning. You can plan out the structural supports and decide later whether to put in hardwood floors or carpeting, what color to paint the walls, what roofing material to use, and so on. A well-planned project improves your ability to change your mind later about details. The more experience you have with the kind of software you’re building, the more details you can take for granted. You just want to be sure that you plan enough so that lack of planning doesn’t create major problems later.”

“The overarching goal of preparation is risk reduction: a good project planner clears major risks out of the way as early as possible so that the bulk of the project can proceed as smoothly as possible. By far the most common project risks in software development are poor requirements and poor project planning […] You might think that all professional programmers know about the importance of preparation and check that the prerequisites have been satisfied before jumping into construction. Unfortunately, that isn’t so. A common cause of incomplete preparation is that the developers who are assigned to work on the upstream activities do not have the expertise to carry out their assignments. The skills needed to plan a project, create a compelling business case, develop comprehensive and accurate requirements, and create high-quality architectures are far from trivial, but most developers have not received training in how to perform these activities. […] Some programmers do know how to perform upstream activities, but they don’t prepare because they can’t resist the urge to begin coding as soon as possible. […] It takes only a few large programs to learn that you can avoid a lot of stress by planning ahead. Let your own experience be your guide. A final reason that programmers don’t prepare is that managers are notoriously unsympathetic to programmers who spend time on construction prerequisites.”

“One of the key ideas in effective programming is that preparation is important. It makes sense that before you start working on a big project, you should plan the project. Big projects require more planning; small projects require less. […] Researchers at Hewlett-Packard, IBM, Hughes Aircraft, TRW, and other organizations have found that purging an error by the beginning of construction allows rework to be done 10 to 100 times less expensively than when it’s done in the last part of the process, during system test or after release […]. In general, the principle is to find an error as close as possible to the time at which it was introduced. The longer the defect stays in the software food chain, the more damage it causes further down the chain. Since requirements are done first, requirements defects have the potential to be in the system longer and to be more expensive. Defects inserted into the software upstream also tend to have broader effects than those inserted further downstream. That also makes early defects more expensive. […] for example, an architecture defect that costs $1000 to fix when the architecture is being created can cost $15,000 to fix during system test. […] The cost to fix a defect rises dramatically as the time from when it’s introduced to when it’s detected increases. This remains true whether the project is highly sequential (doing 100 percent of requirements and design up front) or highly iterative (doing 5 percent of requirements and design up front). […] Dozens of companies have found that simply focusing on correcting defects earlier rather than later in a project can cut development costs and schedules by factors of two or more […]. This is a healthy incentive to find and fix your problems as early as you can.”

“Accommodating changes is one of the most challenging aspects of good program design. The goal is to isolate unstable areas so that the effect of a change will be limited to one routine, class, or package. […] Business rules tend to be the source of frequent software changes. […] Business systems projects tend to benefit from highly iterative approaches, in which planning, requirements, and architecture are interleaved with construction, system testing, and quality-assurance activities. […] Iterative approaches tend to reduce the impact of inadequate upstream work, but they don’t eliminate it. […] Iterative approaches are usually a better option for many reasons, but an iterative approach that ignores prerequisites can end up costing significantly more than a sequential project that pays close attention to prerequisites. […] One common rule of thumb is to plan to specify about 80 percent of the requirements up front, allocate time for additional requirements to be specified later, and then practice systematic change control to accept only the most valuable new requirements as the project progresses. Another alternative is to specify only the most important 20 percent of the requirements up front and plan to develop the rest of the software in small increments, specifying additional requirements and designs as you go. […] One key to successful construction is understanding the degree to which prerequisites have been completed and adjusting your approach accordingly […] The extent to which prerequisites need to be satisfied up front will vary with the project type […], project formality, technical environment, staff capabilities, and project business goals. […] Software being what it is, iterative approaches are useful much more often than sequential approaches are. […] Some projects do too much up front; they doggedly adhere to requirements and plans that have been invalidated by down-stream discoveries, and that can also impede progress during construction.”

“[D]ata from numerous organizations indicates that on large projects an error in requirements detected during the architecture stage is typically 3 times as expensive to correct as it would be if it were detected during the requirements stage. If detected during coding, it’s 5–10 times as expensive; during system test, 10 times; and post-release, a whopping 10–100 times as expensive as it would be if it were detected during requirements development. On smaller projects with lower administrative costs, the multiplier post-release is closer to 5–10 than 100 (Boehm and Turner 2004). […] Specifying requirements adequately is a key to project success, perhaps even more important than effective construction techniques. […] Stable requirements are the holy grail of software development. With stable requirements, a project can proceed from architecture to design to coding to testing in a way that’s orderly, predictable, and calm. […] It’s fine to hope that once your customer has accepted a requirements document, no changes will be needed. On a typical project, however, the customer can’t reliably describe what is needed before the code is written. The problem isn’t that the customers are a lower life form. Just as the more you work with the project, the better you understand it, the more they work with it, the better they understand it. The development process helps customers better understand their own needs, and this is a major source of requirements changes […]. A plan to follow the requirements rigidly is actually a plan not to respond to your customer. How much change is typical? Studies at IBM and other companies have found that the average project experiences about a 25 percent change in requirements during development […], which accounts for 70 to 85 percent of the rework on a typical project […] Make sure everyone knows the cost of requirements changes. […] say, “Gee, that sounds like a great idea. Since it’s not in the requirements document, I’ll work up a revised schedule and cost estimate so that you can decide whether you want to do it now or later.” The words “schedule” and “cost” are more sobering than coffee and a cold shower […] Set up a change-control procedure. […] Having a built-in procedure for controlling changes makes everyone happy. You’re happy because you know that you’ll have to work with changes only at specific times. Your customers are happy because they know that you have a plan for handling their input.”

“Use development approaches that accommodate changes. Some development approaches maximize your ability to respond to changing requirements. An evolutionary prototyping approach helps you explore a system’s requirements before you send your forces in to build it. Evolutionary delivery is an approach that delivers the system in stages. You can build a little, get a little feedback from your users, adjust your design a little, make a few changes, and build a little more. The key is using short development cycles so that you can respond to your users quickly. […] Keep your eye on the business case for the project. Many requirements issues disappear before your eyes when you refer back to the business reason for doing the project. Requirements that seemed like good ideas when considered as “features” can seem like terrible ideas when you evaluate the “incremental business value.””

“The amount of time to spend on problem definition, requirements, and software architecture varies according to the needs of your project. Generally, a well-run project devotes about 10 to 20 percent of its effort and about 20 to 30 percent of its schedule to requirements, architecture, and up-front planning […]. These figures don’t include time for detailed design—that’s part of construction. […] If requirements are unstable and you’re working on a small, informal project, you’ll probably need to resolve requirements issues yourself. Allow time for defining the requirements well enough that their volatility will have a minimal impact on construction. If the requirements are unstable on any project — formal or informal — treat requirements work as its own project. Estimate the time for the rest of the project after you’ve finished the requirements. This is a sensible approach since no one can reasonably expect you to estimate your schedule before you know what you’re building. It’s as if you were a contractor called to work on a house. Your customer says, “What will it cost to do the work?” You reasonably ask, “What do you want me to do?” Your customer says, “I can’t tell you, but how much will it cost?” You reasonably thank the customer for wasting your time and go home.”

“When software-project surveys report causes of project failure, they rarely identify technical reasons as the primary causes of project failure. Projects fail most often because of poor requirements, poor planning, or poor management. But when projects do fail for reasons that are primarily technical, the reason is often uncontrolled complexity. The software is allowed to grow so complex that no one really knows what it does. When a project reaches the point at which no one completely understands the impact that code changes in one area will have on other areas, progress grinds to a halt. […] Managing complexity is the most important technical topic in software development. In my view, it’s so important that Software’s Primary Technical Imperative has to be managing complexity. […] The goal is to minimize the amount of a program you have to think about at any one time. […] The goal of all software-design techniques is to break a complicated problem into simple pieces. The more independent the subsystems are, the more you make it safe to focus on one bit of complexity at a time. Carefully defined objects separate concerns so that you can focus on one thing at a time. Packages provide the same benefit at a higher level of aggregation. Keeping routines short helps reduce your mental workload. Writing programs in terms of the problem domain, rather than in terms of low-level implementation details, and working at the highest level of abstraction reduce the load on your brain. The bottom line is that programmers who compensate for inherent human limitations write code that’s easier for themselves and others to understand and that has fewer errors. […] Once you understand that all other technical goals in software are secondary to managing complexity, many design considerations become straightforward.”

November 23, 2021 Posted by | Books, Computer science

Imitation Games – Avi Wigderson

If you wish to skip the introduction, the talk starts at 5:20. The talk itself lasts roughly an hour, with the last ca. 20 minutes devoted to Q&A – that part is worth watching as well.

Some links related to the talk below:

Theory of computation.
Turing test.
COMPUTING MACHINERY AND INTELLIGENCE.
Probabilistic encryption & how to play mental poker keeping secret all partial information (Goldwasser & Micali, 1982).
Probabilistic algorithm
How To Generate Cryptographically Strong Sequences Of Pseudo-Random Bits (Blum & Micali, 1984).
Randomness extractor
Dense graph
Periodic sequence
Extremal graph theory
Szemerédi’s theorem
Green–Tao theorem
Szemerédi regularity lemma
New Proofs of the Green-Tao-Ziegler Dense Model Theorem: An Exposition
Calibrating Noise to Sensitivity in Private Data Analysis
Generalization in Adaptive Data Analysis and Holdout Reuse
Book: Math and Computation | Avi Wigderson
One-way function
Lattice-based cryptography

August 23, 2021 Posted by | Computer science, Cryptography, Data, Lectures, Mathematics, Science, Statistics

Algorithms to live by…

“…algorithms are not confined to mathematics alone. When you cook bread from a recipe, you’re following an algorithm. When you knit a sweater from a pattern, you’re following an algorithm. When you put a sharp edge on a piece of flint by executing a precise sequence of strikes with the end of an antler—a key step in making fine stone tools—you’re following an algorithm. Algorithms have been a part of human technology ever since the Stone Age.

* * *

In this book, we explore the idea of human algorithm design—searching for better solutions to the challenges people encounter every day. Applying the lens of computer science to everyday life has consequences at many scales. Most immediately, it offers us practical, concrete suggestions for how to solve specific problems. Optimal stopping tells us when to look and when to leap. The explore/exploit tradeoff tells us how to find the balance between trying new things and enjoying our favorites. Sorting theory tells us how (and whether) to arrange our offices. Caching theory tells us how to fill our closets. Scheduling theory tells us how to fill our time. At the next level, computer science gives us a vocabulary for understanding the deeper principles at play in each of these domains. As Carl Sagan put it, “Science is a way of thinking much more than it is a body of knowledge.” Even in cases where life is too messy for us to expect a strict numerical analysis or a ready answer, using intuitions and concepts honed on the simpler forms of these problems offers us a way to understand the key issues and make progress. […] tackling real-world tasks requires being comfortable with chance, trading off time with accuracy, and using approximations.”

I recall Zach Weinersmith recommending the book, and I seem to recall him mentioning when he did so that he’d put off reading it ‘because it sounded like a self-help book’ (paraphrasing). I’m not actually sure how to categorize it but I do know that I really enjoyed it; I gave it five stars on goodreads and added it to my list of favourite books.

The book covers a variety of decision problems and tradeoffs which people face in their everyday lives, as well as strategies for how to approach such problems and identify good solutions (if they exist). Topics include the explore/exploit tradeoff that is so often implicitly present (e.g.: ‘when to look for a new restaurant, vs. picking one you are already familiar with’, or perhaps: ‘when to spend time with friends you already know, vs. spending time trying to find new (/better?) friends?’), optimal stopping rules (‘at which point do you stop looking for a romantic partner and decide that this one is the one?’ – perhaps a well-known problem with a well-known solution, but had you considered that you might use the same analytical framework for questions such as: ‘when to stop looking for a better parking spot and just accept that this one is probably the best you’ll be able to find?’), sorting problems (good and bad ways of sorting, why sort, when is sorting even necessary?, etc.), scheduling theory (how to handle task management well, so that you optimize over a given constraint set – some examples from this part are included in the quotes below), and satisficing vs. optimizing (heuristics, ‘when less is more’, etc.). The book is mainly a computer science book, but it is also to some extent an implicitly interdisciplinary work covering material from a variety of other areas such as statistics, game theory, behavioral economics and psychology. There is a major focus throughout on providing insights which are actionable and can actually be used by the reader, e.g. through the translation of identified solutions into heuristics which might be applied in everyday life. The book is more pop-science-like than any book I’d have liked to read 10 years ago, and it includes too many personal anecdotes for my taste, but in some sense this never felt like a major issue while I was reading; a lot of interesting ideas and topics are covered, and the amount of fluff is within acceptable limits. A related point is that the ‘fluff’ is also part of what makes the book relevant, because the authors focus on tradeoffs and problems which really are highly relevant to key aspects of most people’s lives, including their own.

Below I have added some sample quotes from the book. If you like the quotes you’ll like the book, it’s full of this kind of stuff. I definitely recommend it to anyone remotely interested in decision theory and related topics.

“…one of the deepest truths of machine learning is that, in fact, it’s not always better to use a more complex model, one that takes a greater number of factors into account. And the issue is not just that the extra factors might offer diminishing returns—performing better than a simpler model, but not enough to justify the added complexity. Rather, they might make our predictions dramatically worse. […] overfitting poses a danger every time we’re dealing with noise or mismeasurement – and we almost always are. […] Many prediction algorithms […] start out by searching for the single most important factor rather than jumping to a multi-factor model. Only after finding that first factor do they look for the next most important factor to add to the model, then the next, and so on. Their models can therefore be kept from becoming overly complex simply by stopping the process short, before overfitting has had a chance to creep in. […] This kind of setup — where more time means more complexity — characterizes a lot of human endeavors. Giving yourself more time to decide about something does not necessarily mean that you’ll make a better decision. But it does guarantee that you’ll end up considering more factors, more hypotheticals, more pros and cons, and thus risk overfitting. […] The effectiveness of regularization in all kinds of machine-learning tasks suggests that we can make better decisions by deliberately thinking and doing less. If the factors we come up with first are likely to be the most important ones, then beyond a certain point thinking more about a problem is not only going to be a waste of time and effort — it will lead us to worse solutions. […] sometimes it’s not a matter of choosing between being rational and going with our first instinct. Going with our first instinct can be the rational solution. The more complex, unstable, and uncertain the decision, the more rational an approach that is.” (…for more on these topics I recommend Gigerenzer)
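The forward, one-factor-at-a-time procedure described in the quote is easy to make concrete. Below is a minimal Python sketch of my own (not from the book; the data and names are made up): factors are added greedily, and the loop stops as soon as held-out error stops improving – which is exactly the ‘stop the process short before overfitting has had a chance to creep in’ idea.

import numpy as np

def forward_selection(X_train, y_train, X_val, y_val):
    """Greedy forward selection: add one factor at a time, and keep going only
    while the error on held-out data keeps improving (stopping early is the
    'regularization' here)."""
    n_features = X_train.shape[1]
    chosen, best_val_err = [], np.inf
    while len(chosen) < n_features:
        best_j, best_err = None, None
        for j in range(n_features):
            if j in chosen:
                continue
            cols = chosen + [j]
            # Ordinary least squares on the candidate subset of factors.
            coef, *_ = np.linalg.lstsq(X_train[:, cols], y_train, rcond=None)
            err = np.mean((X_val[:, cols] @ coef - y_val) ** 2)
            if best_err is None or err < best_err:
                best_j, best_err = j, err
        if best_err >= best_val_err:
            break  # another factor no longer helps on held-out data: stop here
        chosen.append(best_j)
        best_val_err = best_err
    return chosen, best_val_err

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 2 * X[:, 0] - X[:, 3] + rng.normal(scale=0.5, size=200)  # only two factors matter
print(forward_selection(X[:150], y[:150], X[150:], y[150:]))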

“If you’re concerned with minimizing maximum lateness, then the best strategy is to start with the task due soonest and work your way toward the task due last. This strategy, known as Earliest Due Date, is fairly intuitive. […] Sometimes due dates aren’t our primary concern and we just want to get stuff done: as much stuff, as quickly as possible. It turns out that translating this seemingly simple desire into an explicit scheduling metric is harder than it sounds. One approach is to take an outsider’s perspective. We’ve noted that in single-machine scheduling, nothing we do can change how long it will take us to finish all of our tasks — but if each task, for instance, represents a waiting client, then there is a way to take up as little of their collective time as possible. Imagine starting on Monday morning with a four-day project and a one-day project on your agenda. If you deliver the bigger project on Thursday afternoon (4 days elapsed) and then the small one on Friday afternoon (5 days elapsed), the clients will have waited a total of 4 + 5 = 9 days. If you reverse the order, however, you can finish the small project on Monday and the big one on Friday, with the clients waiting a total of only 1 + 5 = 6 days. It’s a full workweek for you either way, but now you’ve saved your clients three days of their combined time. Scheduling theorists call this metric the “sum of completion times.” Minimizing the sum of completion times leads to a very simple optimal algorithm called Shortest Processing Time: always do the quickest task you can. Even if you don’t have impatient clients hanging on every job, Shortest Processing Time gets things done.”
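The arithmetic in the quote fits in a few lines of code. A toy Python sketch (my own, not the book’s) of the ‘sum of completion times’ metric and why Shortest Processing Time minimizes it in the four-day/one-day example:

def sum_of_completion_times(durations):
    """Total waiting time across clients when tasks run back to back in the given order."""
    elapsed, total = 0, 0
    for d in durations:
        elapsed += d
        total += elapsed
    return total

print(sum_of_completion_times([4, 1]))          # 9: big project first, as in the quote
print(sum_of_completion_times(sorted([4, 1])))  # 6: Shortest Processing Time order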

“Of course, not all unfinished business is created equal. […] In scheduling, this difference of importance is captured in a variable known as weight. […] The optimal strategy for [minimizing weighted completion time] is a simple modification of Shortest Processing Time: divide the weight of each task by how long it will take to finish, and then work in order from the highest resulting importance-per-unit-time [..] to the lowest. […] this strategy … offers a nice rule of thumb: only prioritize a task that takes twice as long if it’s twice as important.”
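As a minimal sketch of that rule (my own toy example; the task names and numbers are made up), sorting by importance-per-unit-time is a one-liner in Python:

def weighted_spt_order(tasks):
    """Order tasks by weight / duration (importance per unit time), highest first."""
    return sorted(tasks, key=lambda t: t["weight"] / t["duration"], reverse=True)

tasks = [
    {"name": "report", "weight": 4, "duration": 2},  # importance density 2.0
    {"name": "email",  "weight": 1, "duration": 1},  # importance density 1.0
    {"name": "audit",  "weight": 6, "duration": 6},  # importance density 1.0
]
print([t["name"] for t in weighted_spt_order(tasks)])  # ['report', 'email', 'audit']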

“So far we have considered only factors that make scheduling harder. But there is one twist that can make it easier: being able to stop one task partway through and switch to another. This property, “preemption,” turns out to change the game dramatically. Minimizing maximum lateness … or the sum of completion times … both cross the line into intractability if some tasks can’t be started until a particular time. But they return to having efficient solutions once preemption is allowed. In both cases, the classic strategies — Earliest Due Date and Shortest Processing Time, respectively — remain the best, with a fairly straightforward modification. When a task’s starting time comes, compare that task to the one currently under way. If you’re working by Earliest Due Date and the new task is due even sooner than the current one, switch gears; otherwise stay the course. Likewise, if you’re working by Shortest Processing Time, and the new task can be finished faster than the current one, pause to take care of it first; otherwise, continue with what you were doing.”
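The two preemptive switching rules in the quote reduce to single comparisons. A minimal sketch (my own naming; due dates and times are assumed to be plain numbers, e.g. hours):

def should_preempt_edd(current_due, new_due):
    """Earliest Due Date with preemption: switch if the new task is due sooner."""
    return new_due < current_due

def should_preempt_spt(current_time_left, new_duration):
    """Shortest Processing Time with preemption: switch if the new task can be
    finished faster than what remains of the current one."""
    return new_duration < current_time_left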

“…even if you don’t know when tasks will begin, Earliest Due Date and Shortest Processing Time are still optimal strategies, able to guarantee you (on average) the best possible performance in the face of uncertainty. If assignments get tossed on your desk at unpredictable moments, the optimal strategy for minimizing maximum lateness is still the preemptive version of Earliest Due Date—switching to the job that just came up if it’s due sooner than the one you’re currently doing, and otherwise ignoring it. Similarly, the preemptive version of Shortest Processing Time—compare the time left to finish the current task to the time it would take to complete the new one—is still optimal for minimizing the sum of completion times. In fact, the weighted version of Shortest Processing Time is a pretty good candidate for best general-purpose scheduling strategy in the face of uncertainty. It offers a simple prescription for time management: each time a new piece of work comes in, divide its importance by the amount of time it will take to complete. If that figure is higher than for the task you’re currently doing, switch to the new one; otherwise stick with the current task. This algorithm is the closest thing that scheduling theory has to a skeleton key or Swiss Army knife, the optimal strategy not just for one flavor of problem but for many. Under certain assumptions it minimizes not just the sum of weighted completion times, as we might expect, but also the sum of the weights of the late jobs and the sum of the weighted lateness of those jobs.”
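The ‘skeleton key’ version – weighted Shortest Processing Time with preemption – is just that comparison with importance divided by time. A sketch (again my own naming, not the book’s code):

def should_switch(current_weight, current_time_left, new_weight, new_duration):
    """Switch to the incoming task only if its importance-per-unit-time beats
    that of the task currently under way."""
    return new_weight / new_duration > current_weight / current_time_left

# E.g.: a new task worth 3 that takes 1 hour beats a current task worth 10 with 5 hours left.
print(should_switch(current_weight=10, current_time_left=5, new_weight=3, new_duration=1))  # True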

“…preemption isn’t free. Every time you switch tasks, you pay a price, known in computer science as a context switch. When a computer processor shifts its attention away from a given program, there’s always a certain amount of necessary overhead. […] It’s metawork. Every context switch is wasted time. Humans clearly have context-switching costs too. […] Part of what makes real-time scheduling so complex and interesting is that it is fundamentally a negotiation between two principles that aren’t fully compatible. These two principles are called responsiveness and throughput: how quickly you can respond to things, and how much you can get done overall. […] Establishing a minimum amount of time to spend on any one task helps to prevent a commitment to responsiveness from obliterating throughput […] The moral is that you should try to stay on a single task as long as possible without decreasing your responsiveness below the minimum acceptable limit. Decide how responsive you need to be — and then, if you want to get things done, be no more responsive than that. If you find yourself doing a lot of context switching because you’re tackling a heterogeneous collection of short tasks, you can also employ another idea from computer science: “interrupt coalescing.” If you have five credit card bills, for instance, don’t pay them as they arrive; take care of them all in one go when the fifth bill comes.”
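A toy sketch of the interrupt-coalescing idea, using the credit-card-bill example from the quote (my own construction; the batch size of five is arbitrary):

class InterruptCoalescer:
    """Collect small interruptions and handle them together instead of one at a time."""
    def __init__(self, batch_size=5):
        self.batch_size = batch_size
        self.pending = []

    def arrive(self, task):
        self.pending.append(task)
        if len(self.pending) >= self.batch_size:
            batch, self.pending = self.pending, []
            return batch   # deal with all of these in one sitting
        return None        # not worth a context switch yet; keep working

coalescer = InterruptCoalescer(batch_size=5)
for bill in ["bill-1", "bill-2", "bill-3", "bill-4", "bill-5"]:
    ready = coalescer.arrive(bill)
    if ready:
        print("pay in one go:", ready)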

February 10, 2021 Posted by | Books, Computer science, Economics, Game theory, Psychology

Designing Fast and Robust Learning Algorithms – Yu Cheng

Some links related to the lecture’s coverage:

Recommender system.
Collaborative filtering.
Matrix completion.
Non-Convex Matrix Completion Against a Semi-Random Adversary (Cheng & Ge, 2018).
Singular value decomposition.
Spectral graph theory.
Spectral Sparsification of Graphs (Spielman & Teng).
Cut (graph theory).
Split (graph theory).
Robust statistics.
Being Robust (in High Dimensions) Can Be Practical (Diakonikolas et al).
High-Dimensional Robust Mean Estimation in Nearly-Linear Time (Cheng, Diakonikolas and Ge).

October 13, 2019 Posted by | Computer science, Lectures, Mathematics, Statistics

A recent perspective on invariant theory

Some time ago I covered here on the blog a lecture with a somewhat technical introduction to invariant theory. Even though I didn’t recommend that lecture, I do recommend that you don’t watch the lecture above without first knowing the sort of stuff that might have been covered in it (for all you know, that is), as well as some material from other lectures on related topics – to be more specific, to get anything out of this lecture you need some linear algebra, some graph theory, some understanding of group theory, and a little about computational complexity; it’ll probably help if you know a bit about invariant theory already, and surely you also need some knowledge of a few other topics I forgot to mention. One of the statements I made about the introductory lecture to which I linked above also applies here: “I had to look up a lot of stuff to just sort-of-kind-of muddle along”.

Below some links to stuff I looked up while watching the lecture:

Algebraically closed field.
Reductive group.
Rational representation.
Group homomorphism.
Morphism of algebraic varieties.
Fixed-point subring.
Graph isomorphism.
Adjacency matrix.
Group action (mathematics).
General linear group.
Special linear group.
Alternating minimization, scaling algorithms, and the null-cone problem from invariant theory. (Bürgisser, Garg, Oliveira, Walter, and Wigderson (2017))
Noether normalization lemma.
Succinct data structure. (This link is actually not directly related to the lecture’s coverage; I came across it by accident while looking for topics he did talk about and I found it interesting, so I decided to include the link here anyway)
Characteristic polynomial.
Matrix similarity.
Monomial.
Associative algebra.
Polynomial degree bounds for matrix semi-invariants (Derksen & Makam, 2015).
Semi-invariant of a quiver.

July 6, 2019 Posted by | Computer science, Lectures, Mathematics

On the possibility of an instance-based complexity theory

Below some links related to the lecture’s coverage:

Computational complexity theory.
Minimum cut.
2-satisfiability.
3-SAT.
Worst-case complexity.
Average-case complexity.
Max-Cut.
Karp’s 21 NP-complete problems.
Reduction (complexity).
Levin’s Universal search algorithm – Scholarpedia.
Computational indistinguishability.
Circuit complexity.
Adversarial Perturbations of Deep Neural Networks.
Sherrington–Kirkpatrick model.
Equivalence class.
Hopkins (2018).
Planted clique.
SDP (Semidefinite programming).
Jain, Koehler & Risteski (2018): Mean-field approximation, convex hierarchies, and the optimality of correlation rounding: a unified perspective.
Structural operational semantics (SOS).

July 1, 2019 Posted by | Computer science, Lectures

Successes and Challenges in Neural Models for Speech and Language

Some links related to the coverage:
Speech recognition.
Machine translation.
Supervised learning.
Parsing.
Context-free grammar.
Kernel Approximation Methods for Speech Recognition.
Convolutional neural network.
Dependency parsing | NLP-progress.
Natural Language Processing (Almost) from Scratch (Collobert et al.).
A Fast and Accurate Dependency Parser using Neural Networks (Chen and Manning, 2014).
Question answering.
Natural Questions: a Benchmark for Question Answering Research (Kwiatkowski et al.)
Attention Is All You Need (Vaswani et al. 2017).
Softmax function.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al.).

May 5, 2019 Posted by | Computer science, Language, Lectures

Reproducible, Reusable, and Robust Reinforcement Learning

This pdf was created some time before the lecture took place, but it seems to contain all the slides included in the lecture – so if you want a short version of the talk, I guess you can just read that. I’ve added a few other lecture-relevant links below.

Reproducibility, Replicability, and Generalization in the Social, Behavioral, and Economic Sciences (Bollen et al., 2015).
1,500 scientists lift the lid on reproducibility (Nature).
Reinforcement learning.
AlphaGo. Libratus.
Adaptive control of epileptiform excitability in an in vitro model of limbic seizures (Panuccio, Guez, Vincent, Avoli and Pineau, 2013)
Deep Reinforcement Learning that Matters (Henderson et al, 2019).
Policy gradient methods.
Hyperparameter (machine learning).
Transfer learning.

May 1, 2019 Posted by | Computer science, Data, Lectures, Statistics

Artificial intelligence (I?)

This book was okay, but nothing all that special. In my opinion there’s too much philosophy and similar stuff in there (‘what does intelligence really mean anyway?’), and the coverage isn’t nearly as focused on technological aspects as e.g. Winfield’s (…in my opinion better…) book from the same series on robotics (which I covered here) was; I am certain I’d have liked this book better if it had provided a similar type of coverage to Winfield’s, but it didn’t. However, it’s far from terrible, and I liked the author’s skeptical approach to e.g. singularitarianism. Below I have added some quotes and links, as usual.

“Artificial intelligence (AI) seeks to make computers do the sorts of things that minds can do. Some of these (e.g. reasoning) are normally described as ‘intelligent’. Others (e.g. vision) aren’t. But all involve psychological skills — such as perception, association, prediction, planning, motor control — that enable humans and animals to attain their goals. Intelligence isn’t a single dimension, but a richly structured space of diverse information-processing capacities. Accordingly, AI uses many different techniques, addressing many different tasks. […] although AI needs physical machines (i.e. computers), it’s best thought of as using what computer scientists call virtual machines. A virtual machine isn’t a machine depicted in virtual reality, nor something like a simulated car engine used to train mechanics. Rather, it’s the information-processing system that the programmer has in mind when writing a program, and that people have in mind when using it. […] Virtual machines in general are comprised of patterns of activity (information processing) that exist at various levels. […] the human mind can be understood as the virtual machine – or rather, the set of mutually interacting virtual machines, running in parallel […] – that is implemented in the brain. Progress in AI requires progress in defining interesting/useful virtual machines. […] How the information is processed depends on the virtual machine involved. [There are many different approaches.] […] In brief, all the main types of AI were being thought about, and even implemented, by the late 1960s – and in some cases, much earlier than that. […] Neural networks are helpful for modelling aspects of the brain, and for doing pattern recognition and learning. Classical AI (especially when combined with statistics) can model learning too, and also planning and reasoning. Evolutionary programming throws light on biological evolution and brain development. Cellular automata and dynamical systems can be used to model development in living organisms. Some methodologies are closer to biology than to psychology, and some are closer to non-reflective behaviour than to deliberative thought. To understand the full range of mentality, all of them will be needed […]. Many AI researchers [however] don’t care about how minds work: they seek technological efficiency, not scientific understanding. […] In the 21st century, […] it has become clear that different questions require different types of answers”.

“State-of-the-art AI is a many-splendoured thing. It offers a profusion of virtual machines, doing many different kinds of information processing. There’s no key secret here, no core technique unifying the field: AI practitioners work in highly diverse areas, sharing little in terms of goals and methods. […] A host of AI applications exist, designed for countless specific tasks and used in almost every area of life, by laymen and professionals alike. Many outperform even the most expert humans. In that sense, progress has been spectacular. But the AI pioneers weren’t aiming only for specialist systems. They were also hoping for systems with general intelligence. Each human-like capacity they modelled — vision, reasoning, language, learning, and so on — would cover its entire range of challenges. Moreover, these capacities would be integrated when appropriate. Judged by those criteria, progress has been far less impressive. […] General intelligence is still a major challenge, still highly elusive. […] problems can’t always be solved merely by increasing computer power. New problem-solving methods are often needed. Moreover, even if a particular method must succeed in principle, it may need too much time and/or memory to succeed in practice. […] Efficiency is important, too: the fewer the number of computations, the better. In short, problems must be made tractable. There are several basic strategies for doing that. All were pioneered by classical symbolic AI, or GOFAI, and all are still essential today. One is to direct attention to only a part of the search space (the computer’s representation of the problem, within which the solution is assumed to be located). Another is to construct a smaller search space by making simplifying assumptions. A third is to order the search efficiently. Yet another is to construct a different search space, by representing the problem in a new way. These approaches involve heuristics, planning, mathematical simplification, and knowledge representation, respectively. […] Often, the hardest part of AI problem solving is presenting the problem to the system in the first place. […] the information (‘knowledge’) concerned must be presented to the system in a fashion that the machine can understand – in other words, that it can deal with. […] AI’s way of doing this are highly diverse.”

“The rule-based form of knowledge representation enables programs to be built gradually, as the programmer – or perhaps an AGI system itself – learns more about the domain. A new rule can be added at any time. There’s no need to rewrite the program from scratch. However, there’s a catch. If the new rule isn’t logically consistent with the existing ones, the system won’t always do what it’s supposed to do. It may not even approximate what it’s supposed to do. When dealing with a small set of rules, such logical conflicts are easily avoided, but larger systems are less transparent. […] An alternative form of knowledge representation for concepts is semantic networks […] A semantic network links concepts by semantic relations […] semantic networks aren’t the same thing as neural networks. […] distributed neural networks represent knowledge in a very different way. There, individual concepts are represented not by a single node in a carefully defined associative net, but by the changing patterns of activity across an entire network. Such systems can tolerate conflicting evidence, so aren’t bedevilled by the problems of maintaining logical consistency […] Even a single mind involves distributed cognition, for it integrates many cognitive, motivational, and emotional subsystems […] Clearly, human-level AGI would involve distributed cognition.”
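To make the ‘add a rule at any time’ point concrete, here is a minimal, hypothetical forward-chaining sketch in Python (my own toy, not anything from the book): rules are (premises, conclusion) pairs, new rules can simply be appended, and – as the quote warns – nothing here checks whether a newly added rule is logically consistent with the existing ones.

def forward_chain(facts, rules):
    """Repeatedly fire any rule whose premises are all known, until nothing new is derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

rules = [
    (["mammal", "lays_eggs"], "monotreme"),
    (["has_fur"], "mammal"),
]
rules.append((["monotreme"], "native_to_australasia"))  # added later, no rewrite needed
print(forward_chain({"has_fur", "lays_eggs"}, rules))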

“In short, most human visual achievements surpass today’s AI. Often, AI researchers aren’t clear about what questions to ask. For instance, think about folding a slippery satin dress neatly. No robot can do this (although some can be instructed, step by step, how to fold an oblong terry towel). Or consider putting on a T-shirt: the head must go in first, and not via a sleeve — but why? Such topological problems hardly feature in AI. None of this implies that human-level computer vision is impossible. But achieving it is much more difficult than most people believe. So this is a special case of the fact noted in Chapter 1: that AI has taught us that human minds are hugely richer, and more subtle, than psychologists previously imagined. Indeed, that is the main lesson to be learned from AI. […] Difficult though it is to build a high-performing AI specialist, building an AI generalist is orders of magnitude harder. (Deep learning isn’t the answer: its aficionados admit that ‘new paradigms are needed’ to combine it with complex reasoning — scholarly code for ‘we haven’t got a clue’.) That’s why most AI researchers abandoned that early hope, turning instead to multifarious narrowly defined tasks—often with spectacular success.”

“Some machine learning uses neural networks. But much relies on symbolic AI, supplemented by powerful statistical algorithms. In fact, the statistics really do the work, the GOFAI merely guiding the worker to the workplace. Accordingly, some professionals regard machine learning as computer science and/or statistics —not AI. However, there’s no clear boundary here. Machine learning has three broad types: supervised, unsupervised, and reinforcement learning. […] In supervised learning, the programmer ‘trains’ the system by defining a set of desired outcomes for a range of inputs […], and providing continual feedback about whether it has achieved them. The learning system generates hypotheses about the relevant features. Whenever it classifies incorrectly, it amends its hypothesis accordingly. […] In unsupervised learning, the user provides no desired outcomes or error messages. Learning is driven by the principle that co-occurring features engender expectations that they will co-occur in future. Unsupervised learning can be used to discover knowledge. The programmers needn’t know what patterns/clusters exist in the data: the system finds them for itself […but even though Boden does not mention this fact, caution is most definitely warranted when applying such systems/methods to data (..it remains true that “Truth and true models are not statistically identifiable from data” – as usual, the go-to reference here is Burnham & Anderson)]. Finally, reinforcement learning is driven by analogues of reward and punishment: feedback messages telling the system that what it just did was good or bad. Often, reinforcement isn’t simply binary […] Given various theories of probability, there are many different algorithms suitable for distinct types of learning and different data sets.”
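A minimal illustration of the supervised/unsupervised distinction described in the quote, using scikit-learn on made-up data (my example, not the book’s; reinforcement learning is left out since it needs an environment to interact with):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])  # two blobs of points
y = np.array([0] * 50 + [1] * 50)  # labels: the 'desired outcomes'

# Supervised: the trainer supplies the desired outcome for each input.
clf = LogisticRegression().fit(X, y)
print("supervised accuracy:", clf.score(X, y))

# Unsupervised: no labels are given; the system finds the clusters for itself.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", np.bincount(clusters))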

“Countless AI applications use natural language processing (NLP). Most focus on the computer’s ‘understanding’ of language that is presented to it, not on its own linguistic production. That’s because NLP generation is even more difficult than NLP acceptance [I had a suspicion this might be the case before reading the book, but I didn’t know – US]. […] It’s now clear that handling fancy syntax isn’t necessary for summarizing, questioning, or translating a natural-language text. Today’s NLP relies more on brawn (computational power) than on brain (grammatical analysis). Mathematics — specifically, statistics — has overtaken logic, and machine learning (including, but not restricted to, deep learning) has displaced syntactic analysis. […] In modern-day NLP, powerful computers do statistical searches of huge collections (‘corpora’) of texts […] to find word patterns both commonplace and unexpected. […] In general […], the focus is on words and phrases, not syntax. […] Machine-matching of languages from different language groups is usually difficult. […] Human judgements of relevance are often […] much too subtle for today’s NLP. Indeed, relevance is a linguistic/conceptual version of the unforgiving ‘frame problem‘ in robotics […]. Many people would argue that it will never be wholly mastered by a non-human system.”

“[M]any AI research groups are now addressing emotion. Most (not quite all) of this research is theoretically shallow. And most is potentially lucrative, being aimed at developing ‘computer companions’. These are AI systems — some screen-based, some ambulatory robots — designed to interact with people in ways that (besides being practically helpful) are affectively comfortable, even satisfying, for the user. Most are aimed at the elderly and/or disabled, including people with incipient dementia. Some are targeted on babies or infants. Others are interactive ‘adult toys’. […] AI systems can already recognize human emotions in various ways. Some are physiological: monitoring the person’s breathing rate and galvanic skin response. Some are verbal: noting the speaker’s speed and intonation, as well as their vocabulary. And some are visual: analysing their facial expressions. At present, all these methods are relatively crude. The user’s emotions are both easily missed and easily misinterpreted. […] [A key] point here [in the development and evaluation of AI] is that emotions aren’t merely feelings. They involve functional, as well as phenomenal, consciousness […]. Specifically, they are computational mechanisms that enable us to schedule competing motives – and without which we couldn’t function. […] If we are ever to achieve AGI, emotions such as anxiety will have to be included – and used.”

[The point made in the book is better made in Aureli et al.‘s book, especially the last chapters to which the coverage in the linked post refer. The point is that emotions enable us to make better decisions, or perhaps even to make a decision in the first place; the emotions we feel in specific contexts will tend not to be even remotely random, rather they will tend to a significant extent to be Nature’s (…and Mr. Darwin’s) attempt to tell us how to handle a specific conflict of interest in the ‘best’ manner. You don’t need to do the math, your forebears did it for you, which is why you’re now …angry, worried, anxious, etc. If you had to do the math every time before you made a decision, you’d be in trouble, and emotions provide a great shortcut in many contexts. The potential for such short-cuts seems really important if you want an agent to act intelligently, regardless of whether said agent is ‘artificial’ or not. The book very briefly mentions a few of Minsky’s thoughts on these topics, and people who are curious could probably do worse than read some of his stuff. This book seems like a place to start.]

Links:

GOFAI (“Good Old-Fashioned Artificial Intelligence”).
Ada Lovelace. Charles Babbage. Alan Turing. Turing machine. Turing test. Norbert Wiener. John von Neumann. W. Ross Ashby. William Grey Walter. Oliver Selfridge. Kenneth Craik. Gregory Bateson. Frank Rosenblatt. Marvin Minsky. Seymour Papert.
A logical calculus of the ideas immanent in nervous activity (McCulloch & Pitts, 1943).
Propositional logic. Logic gate.
Arthur Samuel’s checkers player. Logic Theorist. General Problem Solver. The Homeostat. Pandemonium architecture. Perceptron. Cyc.
Fault-tolerant computer system.
Cybernetics.
Programmed Data Processor (PDP).
Artificial life.
Forward chaining. Backward chaining.
Rule-based programming. MYCIN. Dendral.
Semantic network.
Non-monotonic logic. Fuzzy logic.
Facial recognition system. Computer vision.
Bayesian statistics.
Helmholtz machine.
DQN algorithm.
AlphaGo. AlphaZero.
Human Problem Solving (Newell & Simon, 1970).
ACT-R.
NELL (Never-Ending Language Learning).
SHRDLU.
ALPAC.
Google translate.
Data mining. Sentiment analysis. Siri. Watson (computer).
Paro (robot).
Uncanny valley.
CogAff architecture.
Connectionism.
Constraint satisfaction.
Content-addressable memory.
Graceful degradation.
Physical symbol system hypothesis.

January 10, 2019 Posted by | Biology, Books, Computer science, Engineering, Language, Mathematics, Papers, Psychology, Statistics

Some observations on a cryptographic problem

It’s been a long time since I last posted one of these ‘rootless’ posts which are not based on a specific book or lecture or something along those lines, but a question on r/science made me think about these topics and start writing a bit about them, and I decided I might as well add my thoughts and ideas here.

The reddit question which motivated me to write this post was this one: “Is it difficult to determine the password for an encryption if you are given both the encrypted and unencrypted message?

By “difficult” I mean requiring an inordinate amount of computation. If given both an encrypted and unencrypted file/message, is it reasonable to be able to recover the password that was used to encrypt the file/message?”

Judging from the way the question is worded, the inquirer obviously knows very little about these topics, but that was part of what motivated me to start writing; s/he has a faulty model of how this kind of stuff actually works, and just by virtue of the way the question is asked s/he illustrates some of the ways in which s/he gets things wrong.

When I decided to transfer my efforts towards discussing these topics to the blog I also implicitly decided against using language that would be expected to be easily comprehensible for the original inquirer, as s/he was no longer in the target group and there’s a cost to using that kind of language when discussing technical matters. I’ve sort of tried to make this post both useful and readable to people not all that familiar with the related fields, but I tend to find it difficult to evaluate the extent to which I’ve succeeded when I try to do things like that.

I decided against adding stuff already commented on when I started out writing this, so I’ll not e.g. repeat noiwontfixyourpc’s reply below. However I have added some other observations that seem to me to be relevant and worth mentioning to people who might consider asking a similar question to the one the original inquirer asked in that thread:

i. Finding a way to make plaintext turn into cipher text (…or cipher text into plaintext; and no, these two things are not actually always equivalent, see below…) is a very different (and in many contexts much easier) problem than finding out the actual encryption scheme that is at work producing the text strings you observe. There can be many, many different ways to go from a specific sample of plaintext to a specific sample of cipher text, and most of the candidate solutions won’t work if you’re faced with a new piece of cipher text; especially not if the original samples are small, so that only a small amount of (potential) information would be expected to be contained in the text strings.

If you only get a small amount of plaintext and corresponding cipher text, you may decide that algorithm A is the one that was applied to the message, even if the algorithm actually applied was a more complex algorithm, B. To illustrate in a very simple way how this might happen: A might be a special case of B, in the sense that B is a superset containing A along with a large number of other potential encryption algorithms (…or the actually applied scheme might be an even more general scheme C of which B happens to be a subset, or… etc.). In such a context A might be an encryption scheme/approach that only applies in very specific situations; for example, (part of) the coding algorithm might have been the decision that ‘next Tuesday we’ll use this specific algorithm to translate plaintext into cipher text, and we’ll never use that specific translation/mapping algorithm (which may be but one component of the encryption algorithm) again’. If such a situation applies, then you’re faced with the problem that even if your rule ‘worked’ in that particular instance, in terms of translating your plaintext into cipher text and vice versa, it only ‘worked’ because you blindly fitted the two data sets in a way that looked right, even though you actually had no idea how the coding scheme really worked (you only guessed A, not B, and in this particular instance A is never actually going to be used again).

On a more general level, some of the above comments in my view link quite directly to results from classical statistics; there are many ways to link random variables through data-fitting methods, but reliably identifying proper causal linkages through the application of such approaches is, well, difficult (and, according to some, often ill-advised)…

ii. In my view, it does not seem possible in general to prove that any specific proposed encryption/decryption algorithm is ‘the correct one’. This is because the proposed algorithm will never be a unique solution to the problem you’re evaluating. How are you going to convince me that The True Algorithm is not a more general/complex one (or perhaps a completely different one – see iii. below) than the one you propose, and that your solution is not missing relevant variables? The only way to truly test whether the proposed algorithm is a valid algorithm is to test it on new data and compare its performance on this new data set with the performances of competing solution proposals which also managed to correctly link cipher text and plaintext. If the algorithm doesn’t work on the new data, you got it wrong. If it does work on new data, well, you might still just have been lucky. You might get more confident with more correctly-assessed (…guessed?) data, but you never become certain. In other similar contexts a not uncommon approach for trying to get around these sorts of problems is to limit the analysis to a subset of the data available in order to obtain the algorithm, and then use the rest of the data for validation purposes (here’s a relevant link), but here even with highly efficient estimation approaches you will almost certainly run out of information (/degrees of freedom) long before you get anywhere if the encryption algorithm is at all non-trivial. In these settings information is likely to be a limiting resource.
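A toy Python sketch of the ‘fit on a subset, validate on the rest’ idea in this setting (entirely my own construction, and of course nothing like a real attack): the candidate ‘algorithms’ are the 26 Caesar shifts, we keep the ones consistent with the training pairs, and then check the survivors against held-out pairs. With realistic encryption schemes the candidate space is astronomically larger, which is exactly the information problem described above.

def caesar(text, shift):
    return "".join(chr((ord(c) - 97 + shift) % 26 + 97) if c.isalpha() else c
                   for c in text.lower())

pairs = [("attack", "cvvcem"), ("retreat", "tgvtgcv"), ("hold", "jqnf")]
train, holdout = pairs[:2], pairs[2:]

candidates = [s for s in range(26)
              if all(caesar(plain, s) == cipher for plain, cipher in train)]
survivors = [s for s in candidates
             if all(caesar(plain, s) == cipher for plain, cipher in holdout)]
print(candidates, survivors)  # shifts consistent with training vs. also with held-out data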

iii. There are many different types of encryption schemes, and people who ask questions like the one above tend, I believe, to have a quite limited view of which methods and approaches are truly available to one who desires secrecy when exchanging information with others. Imagine a situation where the plaintext is ‘See you next Wednesday’ and the encrypted text is an English translation of Tolstoy’s book War and Peace (or, to make it even more fun, all pages published on the English version of Wikipedia, say on November the 5th, 2017 at midnight GMT). That’s an available encryption approach that might be applied. It might be a part (‘A’) of a more general (‘B’) encryption approach of linking specific messages from a preconceived list of messages, which had been considered worth sending in the future when the algorithm was chosen, to specific book titles decided on in advance. So if you want to say ‘good Sunday!’, Eve gets to read the Bible and see where that gets her. You could also decide that in half of all cases the book cipher text links to specific messages from a list but in the other half of the cases what you actually mean to communicate is on page 21 of the book; this might throw a hacker who saw a combined cipher text and plaintext combination resulting from that part of the algorithm off in terms of the other half, and vice versa – and it illustrates well one of the key problems you’re faced with as an attacker when working on cryptographic schemes about which you have limited knowledge; the opponent can always add new layers on top of the ones that already exist/apply to make the problem harder to solve. And so you could also link the specific list message with some really complicated cipher-encrypted version of the Bible. There’s a lot more to encryption schemes than just exchanging a few letters here and there. On related topics, see this link. On a different if related topic, people who desire secrecy when exchanging information may also attempt to try to hide the fact that any secrets are exchanged in the first place. See also this.
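A toy version of the codebook idea above (purely illustrative and obviously not secure; the file names are made up): whole messages are mapped to prearranged cipher texts decided on in advance, so there is nothing letter-level for an attacker to exploit.

codebook = {
    "See you next Wednesday": "war_and_peace.txt",
    "good Sunday!": "bible.txt",
}
reverse = {v: k for k, v in codebook.items()}

def encrypt(message):
    return codebook[message]   # only works for messages agreed on in advance

def decrypt(cipher_text):
    return reverse[cipher_text]

print(decrypt(encrypt("See you next Wednesday")))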

iv. The specific usage of the word ‘password’ in the original query calls for comment for multiple reasons, some of which have been touched upon above, perhaps mainly because it implicitly betrays a lack of knowledge about how modern cryptographic systems actually work. The thing is, even if you might consider an encryption scheme to just be an advanced sort of ‘password’, finding the password (singular) is not always the task you’re faced with today. In symmetric-key algorithm settings you might sort-of-kind-of argue that it sort-of is – in such settings you might say that you have one single (collection of) key(s) which you use to encrypt messages and also use to decrypt the messages. So you can both encrypt and decrypt the message using the same key(s), and so you only have one ‘password’. That’s however not how asymmetric-key encryption works. As wiki puts it: “In an asymmetric key encryption scheme, anyone can encrypt messages using the public key, but only the holder of the paired private key can decrypt.”

This of course relates to what you actually want to do/achieve when you get your samples of cipher text and plaintext. In some cryptographic contexts the route you need to go to get from cipher text to plaintext is by design conceptually different from the route you need to go to get from plaintext to cipher text. And some of the ‘passwords’ that relate to how the schemes work are public knowledge by design.
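
To make the asymmetry concrete, here is a toy, thoroughly insecure RSA-style sketch with tiny textbook primes (my own illustration, not taken from the discussion above): the key used to encrypt, (e, n), can be public, while the key needed to decrypt, (d, n), is private and is recoverable from the public key only by factoring n.

```python
# Toy RSA with tiny primes -- insecure, purely to illustrate that the key used
# to encrypt and the key used to decrypt are different objects.
p, q = 61, 53                 # secret primes (absurdly small for real use)
n = p * q                     # public modulus
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent, chosen coprime to phi
d = pow(e, -1, phi)           # private exponent: modular inverse of e mod phi (Python 3.8+)

m = 1234                      # a message encoded as an integer smaller than n
c = pow(m, e, n)              # anyone who knows (e, n) can encrypt
assert pow(c, d, n) == m      # only the holder of d can (efficiently) decrypt
print(f"n = {n}, e = {e}, d = {d}, ciphertext = {c}")
```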

v. I have already touched a bit upon the problem of the existence of an information constraint, but I realized I probably need to spell this out in a bit more detail. The original inquirer seems to me to be implicitly under the misapprehension that computational complexity is the only limiting constraint here (“By “difficult” I mean requiring an inordinate amount of computation.”). Given the setting he or she proposes, I don’t think that’s true, and why that is is sort of interesting.

If you think about what kind of problem you’re facing, what you have here in this setting is really a very limited amount of data which relates in an unknown manner to an unknown data-generating process (‘algorithm’). There are, as has been touched upon, in general many ways to obtain linkage between two data sets (the cipher text and the plaintext) using an algorithm – too many ways for comfort, actually. The search space is large; there are too many algorithms to consider, or equivalently, the amount of information supplied by the data will often be too small for us to properly evaluate the algorithms under consideration. An important observation is that more complex algorithms will both take longer to calculate (‘identify’ …at least as candidates) and be expected to require more data to evaluate, at least to the extent that algorithmic complexity constrains the data (/relates to changes in data structure/composition that need to be modeled in order to evaluate/identify the goal algorithm). If the algorithm says a different encryption rule is at work on Wednesdays, you’re going to have trouble figuring that out if you only got hold of a cipher text/plaintext combination derived from an exchange which took place on a Saturday. There are methods from statistics that might conceivably help you deal with problems like these, but they have their own issues and trade-offs. You might limit yourself to considering only settings where you have access to all known plaintext and cipher text combinations, so you got both Wednesday and Saturday, but even here you can’t be sure – next (metaphorical, I probably at this point need to add) Friday might be different from last (metaphorical) Friday, and this could even be baked into the algorithm in very non-obvious ways.

The above remarks might give you the idea that I’m just coming up with these kinds of suggestions to try to foil your approaches to figuring out the algorithm ‘by cheating’ (…it shouldn’t matter whether or not it was ‘sent on a Saturday’), but the main point is that a complex encryption algorithm is complex, and even if you see it applied multiple times you might not get enough information about how it works from the data suggested to be able to evaluate if you guessed right. In fact, given a combination of a sparse data set (one message, or just a few messages, in plaintext and cipher text) and a complex algorithm involving a very non-obvious mapping function, the odds are strongly against you.
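
To put a number on the information constraint, here is a minimal sketch (my own, and deliberately using a far simpler scheme than anything contemplated above): even for a plain monoalphabetic substitution cipher, a single plaintext/cipher text pair only pins down the key for the letters that actually occur in the message, leaving an enormous number of keys fully consistent with the observed data.

```python
# How many monoalphabetic substitution keys are consistent with one observed
# plaintext/ciphertext pair? Letters absent from the message are unconstrained,
# so any permutation of their images fits the data equally well.
from math import factorial

def consistent_keys(plaintext: str) -> int:
    distinct = {c for c in plaintext.lower() if c.isalpha()}
    return factorial(26 - len(distinct))

for msg in ["see you next wednesday", "good sunday"]:
    print(f"{msg!r}: {consistent_keys(msg):,} keys fit the observed pair")
```

And that is for the simplest of schemes; anything with layers of the sort described above only increases the number of observationally indistinguishable candidates.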

vi. I had the thought that one reason why the inquirer might be confused about some of these things is that s/he might well be aware of the existence of modern cryptographic techniques which do rely to a significant extent on computational complexity aspects. I.e., here you do have settings where you’re asked to provide ‘the right answer’ (‘the password’), but it’s hard to calculate the right answer in a reasonable amount of time unless you have the relevant (private) information at hand – see e.g. these links for more. One way to think about how such a problem relates to the other problem at hand (you have been presented with samples of cipher text and plaintext and you want to guess all the details about how the encryption and decryption schemes which were applied work) is that this kind of algorithm/approach may be applied in combination with other algorithmic approaches to encrypt/decrypt the text you’re analyzing. A really tough prime factorization problem might for all we know be an embedded component of the cryptographic process that is applied to our text. We could call it A.

In such a situation we would definitely be in trouble because stuff like prime factorization is really hard and computationally complex, and to make matters worse just looking at the plaintext and the cipher text would not make it obvious to us that a prime factorization scheme had even been applied to the data. But a really important point is that even if such a tough problem was not present and even if only relatively less computationally demanding problems were involved, we almost certainly still just wouldn’t have enough information to break any semi-decent encryption algorithm based on a small sample of plaintext and cipher text. It might help a little bit, but in the setting contemplated by the inquirer a ‘faster computer’ (/…’more efficient decision algorithm’, etc.) can only help so much.
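
As a minimal illustration of the kind of computational asymmetry mentioned above (my own example, with primes picked arbitrarily): multiplying two primes is a single cheap operation, while recovering them from the product by naive trial division takes a number of operations that grows rapidly with the size of the smaller factor – and real schemes use numbers vastly larger than these.

```python
# One multiplication versus many trial divisions (toy illustration).
def trial_factor(n: int):
    """Return (p, q, divisions_tried) for a product of two primes."""
    tries, d = 0, 2
    while d * d <= n:
        tries += 1
        if n % d == 0:
            return d, n // d, tries
        d += 1
    return n, 1, tries

p, q = 104729, 1299709        # two modest primes
n = p * q                     # the 'easy' direction: one multiplication
print(trial_factor(n))        # the 'hard' direction: ~100,000 divisions, even at this size
```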

vii. Shannon and Kerckhoffs may have a point in a general setting, but in specific settings like this particular one I think it is well worth taking into account the implications of not having a (publicly) known algorithm to attack. As wiki notes (see the previous link), ‘Many ciphers are actually based on publicly known algorithms or are open source and so it is only the difficulty of obtaining the key that determines security of the system’. The above remarks were of course all based on an assumption that Eve does not here have the sort of knowledge about the encryption scheme applied that she in many cases today actually might have. There are obvious and well-known weaknesses associated with having the security of a cryptographic scheme depend on components other than the key, but I do not see how keeping the algorithm secret would not, in this particular setting, cause a search-space blow-up that makes the decision problem (did we actually guess right?) intractable in many cases. A key feature of the problem considered by the inquirer is that here – unlike in many ‘guess the password’ settings, where for example a correct password will grant you access to an application or a document or whatever – you do not get any feedback, whether you guess right or wrong; it’s a decision problem, not a calculation problem. (It is however perhaps worth noting that in a ‘standard guess-the-password problem’ you may also sometimes implicitly face a similar decision problem, due to e.g. the potential for a combination of cryptographic security and complementary steganographic strategies like e.g. these having been applied.)

August 14, 2018 Posted by | Computer science, Cryptography, Data, rambling nonsense, Statistics | Leave a comment

Combinatorics (I)

This book is not a particularly easy read compared to the general format of the series in which it is published, but in my view this is a good thing, as it also means the author managed to go into enough detail in specific contexts to touch upon at least some properties/topics of interest. You don’t need any specific background knowledge to read and understand the book – at least not any sort of background knowledge one would not expect someone who might decide to read a book like this one to already have – but when reading it you do need the sort of mental surplus that enables you to think carefully about what’s going on and devote a few mental resources to understanding the details.

Some quotes and links from the first half of the book below.

“The subject of combinatorial analysis or combinatorics […] [w]e may loosely describe [as] the branch of mathematics concerned with selecting, arranging, constructing, classifying, and counting or listing things. […] the subject involves finite sets or discrete elements that proceed in separate steps […] rather than continuous systems […] Mathematicians sometimes use the term ‘combinatorics’ to refer to a larger subset of discrete mathematics that includes graph theory. In that case, what is commonly called combinatorics is then referred to as ‘enumeration’. […] Combinatorics now includes a wide range of topics, some of which we cover in this book, such as the geometry of tilings and polyhedra […], the theory of graphs […], magic squares and latin squares […], block designs and finite projective planes […], and partitions of numbers […]. [The] chapters [of the book] are largely independent of each other and can be read in any order. Much of combinatorics originated in recreational pastimes […] in recent years the subject has developed in depth and variety and has increasingly become a part of mainstream mathematics. […] Undoubtedly part of the reason for the subject’s recent importance has arisen from the growth of computer science and the increasing use of algorithmic methods for solving real-world practical problems. These have led to combinatorial applications in a wide range of subject areas, both within and outside mathematics, including network analysis, coding theory, probability, virology, experimental design, scheduling, and operations research.”

“[C]ombinatorics is primarily concerned with four types of problem:
Existence problem: Does □□□ exist?
Construction problem: If □□□ exists, how can we construct it?
Enumeration problem: How many □□□ are there?
Optimization problem: Which □□□ is best? […]
[T]hese types of problems are not unrelated; for example, the easiest way to prove that something exists may be to construct it explicitly.”

“In this book we consider two types of enumeration problem – counting problems in which we simply wish to know the number of objects involved, and listing problems in which we want to list them all explicitly. […] It’s useful to have some basic counting rules […] In what follows, all the sets are finite. […] In general we have the following rule; here, subsets are disjoint if they have no objects in common: Addition rule: To find the number of objects in a set, split the set into disjoint subsets, count the objects in each subset, and add the results. […] Subtraction rule: If a set of objects can be split into two subsets A and B, then the number of objects in B is obtained by subtracting the number of objects in A from the number in the whole set. […] The subtraction rule extends easily to sets that are split into more than two subsets with no elements in common. […] the inclusion-exclusion principle […] extends this simple idea to the situation where the subsets may have objects in common. […] In general we have the following result: Multiplication rule: If a counting problem can be split into stages with several options at each stage, then the total number of possibilities is the product of options at each stage. […] Another useful principle in combinatorics is the following: Correspondence rule: We can solve a counting problem if we can put the objects to be counted in one-to-one correspondence with the objects of a set that we have already counted. […] We conclude this section with one more rule: Division rule: If a set of n elements can be split into m disjoint subsets, each of size k, then m = n / k.”
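
As a small aside, these rules are easy to sanity-check by brute-force enumeration; the sketch below (my own examples, not from the book) verifies the multiplication rule, the inclusion-exclusion principle, and the division rule on tiny cases.

```python
# Brute-force checks of a few counting rules on small examples.
from itertools import combinations, permutations, product
from math import factorial

# Multiplication rule: 3 shirts x 4 pairs of trousers -> 12 outfits.
assert len(list(product(range(3), range(4)))) == 3 * 4

# Inclusion-exclusion: integers below 1000 divisible by 3 or by 5.
A = {k for k in range(1000) if k % 3 == 0}
B = {k for k in range(1000) if k % 5 == 0}
assert len(A | B) == len(A) + len(B) - len(A & B)

# Division rule: the ordered selections of 4 out of 10 objects split into
# disjoint groups of size 4!, one group per unordered selection.
n, k = 10, 4
assert len(list(combinations(range(n), k))) == len(list(permutations(range(n), k))) // factorial(k)

print("all counting-rule checks passed")
```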

“Every algorithm has a running time […] this may be the time that a computer needs to carry out all the necessary calculations, or the actual number of such calculations. Each problem [also] has an input size […] the running time T usually depends on the input size n. Particularly important, because they’re the most efficient, are the polynomial-time algorithms, where the maximum running time is proportional to a power of the input size […] The collection of all polynomial-time algorithms is called P. […] In contrast, there are inefficient algorithms that don’t take polynomial time, such as the exponential-time algorithms […] At this point we introduce NP, the set of ‘non-deterministic polynomial-time problems’. These are algorithms for which a solution, when given, can be checked in polynomial time. Clearly P is contained in NP, since if a problem can be solved in polynomial time then a solution can certainly be checked in polynomial time – checking solutions is far easier than finding them in the first place. But are they the same? […] Few people believe that the answer is ‘yes’, but no one has been able to prove that P ≠ NP. […] a problem is NP-complete if its solution in polynomial time means that every NP problem can be solved in polynomial time. […] If there were a polynomial algorithm for just one of them, then polynomial algorithms would exist for the whole lot and P would equal NP. On the other hand, if just one of them has no polynomial algorithm, then none of the others could have a polynomial algorithm either, and P would be different from NP.”
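
The ‘checking solutions is far easier than finding them’ point can be illustrated with a toy satisfiability example (my own, and the formula is made up): verifying a proposed truth assignment takes time proportional to the size of the formula, whereas the naive way of finding an assignment tries up to 2^n candidates.

```python
# Verifying a candidate solution is cheap; finding one by brute force is not.
from itertools import product

# A CNF formula: each clause lists literals, +i meaning x_i and -i meaning not x_i.
formula = [[1, -2], [-1, 3], [2, 3], [-3, -2]]
n_vars = 3

def satisfies(assignment, formula):
    """Polynomial-time check of a proposed assignment (a tuple of booleans)."""
    return all(any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
               for clause in formula)

# Naive search: up to 2**n_vars candidate assignments must be examined.
solutions = [a for a in product([False, True], repeat=n_vars) if satisfies(a, formula)]
print(solutions)
```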

“In how many different ways can n objects be arranged? […] generally, we have the following result: Arrangements: The number of arrangements of n objects is n x (n – 1) x (n – 2) x … x 3 x 2 x 1. This number is called n factorial and is denoted by n!. […] The word permutation is used in different ways. We’ll use it to mean an ordered selection without repetition, while others may use it to mean an arrangement […] generally, we have the following rule: Ordered selections without repetition (permutations): If we select k items from a set of n objects, and if the selections are ordered and repetition is not allowed, then the number of possible selections is n x (n – 1) x (n – 2) x … x (n – k + 1). We denote this expression by P(n,k). […] Since P(n, n) = n x (n – 1) x (n – 2) x … x 3 x 2 x 1 = n!, an arrangement is a permutation for which k = n. […] generally, we have the following result: P(n,k) = n!/(n-k)!. […] unordered selections without repetition are called combinations, giving rise to the words combinatorial and combinatorics. […] generally, we have the following result: Unordered selections without repetition (combinations): If we select k items from a set of n objects, and if the selections are unordered and repetition is not allowed, then the number of possible selections is P(n,k)/k! = n x (n – 1) x (n – 2) x … x (n – k + 1)/k!. We denote this expression by C(n,k) […] Unordered selections with repetition: If we select k items from a set of n objects, and if the selections are unordered and repetition is allowed, then the number of possible selections is C(n + k – 1, k). […] Combination rule 1: For any numbers k and n with k ≤ n, C(n,k) = C(n,n-k) […] Combination rule 2: For any numbers n and k with k ≤ n, C(n,n-k) = n!/(n-k)!(n-(n-k))! = n!/(n-k)!k! = C(n,k). […] Combination rule 3: For any number n, C(n,0) + C(n,1) + C(n,2) + … + C(n,n-1) + C(n,n) = 2^n”
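
These formulas are again easy to check against direct enumeration; a short sketch (my own verification, not from the book):

```python
# Checking the selection formulas against brute-force enumeration.
from itertools import combinations, combinations_with_replacement, permutations
from math import comb, factorial, perm

n, k = 6, 3
# Ordered selections without repetition: P(n,k) = n!/(n-k)!.
assert len(list(permutations(range(n), k))) == perm(n, k) == factorial(n) // factorial(n - k)
# Unordered selections without repetition: C(n,k) = P(n,k)/k!.
assert len(list(combinations(range(n), k))) == comb(n, k) == perm(n, k) // factorial(k)
# Unordered selections with repetition: C(n + k - 1, k).
assert len(list(combinations_with_replacement(range(n), k))) == comb(n + k - 1, k)
# Combination rules 1/2 (symmetry) and 3 (row sum of Pascal's triangle).
assert comb(n, k) == comb(n, n - k)
assert sum(comb(n, i) for i in range(n + 1)) == 2 ** n
print("all selection-formula checks passed")
```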

Links:

Tilings/Tessellation.
Knight’s tour.
Seven Bridges of Königsberg problem.
Three utilities problem.
Four color theorem.
Tarry’s algorithm (p.7) (formulated slightly differently in the book, but it’s the same algorithm).
Polyomino.
Arthur Cayley.
Combinatorial principles.
Minimum connector problem.
Travelling salesman problem.
Algorithmic efficiency. Running time/time complexity.
Boolean satisfiability problem. Cook–Levin theorem.
Combination.
Mersenne primes.
Permutation. Factorial. Stirling’s formula.
Birthday problem.
Varāhamihira.
Manhattan distance.
Fibonacci number.
Pascal’s triangle. Binomial coefficient. Binomial theorem.
Pigeonhole principle.
Venn diagram.
Derangement (combinatorial mathematics).
Tower of Hanoi.
Stable marriage problem. Transversal (combinatorics). Hall’s marriage theorem.
Generating function (the topic covered in the book more specifically is related to a symbolic generator of the subsets of a set, but a brief search yielded no good links to this particular topic – US).
Group theory.
Ferdinand Frobenius. Burnside’s lemma.

August 4, 2018 Posted by | Books, Computer science, Mathematics | 1 Comment

Lyapunov Arguments in Optimization

I’d say that if you’re interested in the intersection of mathematical optimization methods/-algorithms and dynamical systems analysis, the talk is probably well worth watching. The lecture is reasonably high-level and covers a fairly satisfactory amount of ground in a relatively short amount of time, and it is not particularly hard to follow if you have at least some passing familiarity with the fields involved (dynamical systems analysis, statistics, mathematical optimization, computer science/machine learning).
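
For readers who prefer a concrete toy example of the basic idea, here is a minimal sketch (my own, not taken from the talk): for gradient descent on a smooth convex quadratic, the suboptimality gap f(x_k) − f(x*) works as a Lyapunov function – it is non-negative and decreases at every iteration for a suitable step size.

```python
# Gradient descent on f(x) = 0.5 * x'Ax with a Lyapunov-style monotonicity check.
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])                 # symmetric positive definite
f = lambda x: 0.5 * x @ A @ x              # minimum value 0, attained at x* = 0
L = np.linalg.eigvalsh(A).max()            # smoothness (Lipschitz) constant of the gradient

x = np.array([5.0, -3.0])
V = [f(x)]                                 # V_k = f(x_k) - f(x*) is the Lyapunov function
for _ in range(25):
    x = x - (1.0 / L) * (A @ x)            # gradient step with step size 1/L
    V.append(f(x))

assert all(V[i + 1] <= V[i] for i in range(len(V) - 1))   # V decreases monotonically
print(round(V[0], 3), "->", round(V[-1], 6))
```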

Some links:

Dynamical system.
Euler–Lagrange equation.
Continuous optimization problem.
Gradient descent algorithm.
Lyapunov stability.
Condition number.
Fast (/accelerated-) gradient descent methods.
The Mirror Descent Algorithm.
Cubic regularization of Newton method and its global performance (Nesterov & Polyak).
A Differential Equation for Modeling Nesterov’s Accelerated Gradient Method: Theory and Insights (Su, Boyd & Candès).
A Variational Perspective on Accelerated Methods in Optimization (Wibisono, Wilson & Jordan).
Breaking Locality Accelerates Block Gauss-Seidel (Tu, Venkataraman, Wilson, Gittens, Jordan & Recht).
A Lyapunov Analysis of Momentum Methods in Optimization (Wilson, Recht & Jordan).
Bregman divergence.
Estimate sequence methods.
Variance reduction techniques.
Stochastic gradient descent.
Langevin dynamics.

 

July 22, 2018 Posted by | Computer science, Lectures, Mathematics, Physics, Statistics | Leave a comment

Big Data (II)

Below I have added a few observations from the last half of the book, as well as some coverage-related links to topics of interest.

“With big data, using correlation creates […] problems. If we consider a massive dataset, algorithms can be written that, when applied, return a large number of spurious correlations that are totally independent of the views, opinions, or hypotheses of any human being. Problems arise with false correlations — for example, divorce rate and margarine consumption […]. [W]hen the number of variables becomes large, the number of spurious correlations also increases. This is one of the main problems associated with trying to extract useful information from big data, because in doing so, as with mining big data, we are usually looking for patterns and correlations. […] one of the reasons Google Flu Trends failed in its predictions was because of these problems. […] The Google Flu Trends project hinged on the known result that there is a high correlation between the number of flu-related online searches and visits to the doctor’s surgery. If a lot of people in a particular area are searching for flu-related information online, it might then be possible to predict the spread of flu cases to adjoining areas. Since the interest is in finding trends, the data can be anonymized and hence no consent from individuals is required. Using their five-year accumulation of data, which they limited to the same time-frame as the CDC data, and so collected only during the flu season, Google counted the weekly occurrence of each of the fifty million most common search queries covering all subjects. These search query counts were then compared with the CDC flu data, and those with the highest correlation were used in the flu trends model. […] The historical data provided a baseline from which to assess current flu activity on the chosen search terms and by comparing the new real-time data against this, a classification on a scale from 1 to 5, where 5 signified the most severe, was established. Used in the 2011–12 and 2012–13 US flu seasons, Google’s big data algorithm famously failed to deliver. After the flu season ended, its predictions were checked against the CDC’s actual data. […] the Google Flu Trends algorithm over-predicted the number of flu cases by at least 50 per cent during the years it was used.” [For more details on why blind/mindless hypothesis testing/p-value hunting on big data sets is usually a terrible idea, see e.g. Burnham & Anderson, US]
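
The ‘more variables, more spurious correlations’ point is easy to demonstrate by simulation; here is a minimal sketch (my own, with numbers chosen only for illustration):

```python
# With enough unrelated variables, some will correlate with any target by chance.
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars = 50, 10_000
target = rng.normal(size=n_obs)              # e.g. weekly counts of doctor visits
noise = rng.normal(size=(n_vars, n_obs))     # unrelated candidate predictor series

corrs = np.array([np.corrcoef(target, row)[0, 1] for row in noise])
strong = int((np.abs(corrs) > 0.4).sum())
print(f"{strong} of {n_vars} pure-noise variables reach |r| > 0.4 with the target")
```

None of these variables has anything to do with the target; with ten thousand candidates, a handful of impressive-looking correlations are expected purely by chance.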

“The data Google used [in the Google Flu Trends algorithm], collected selectively from search engine queries, produced results [with] obvious bias […] for example by eliminating everyone who does not use a computer and everyone using other search engines. Another issue that may have led to poor results was that customers searching Google on ‘flu symptoms’ would probably have explored a number of flu-related websites, resulting in their being counted several times and thus inflating the numbers. In addition, search behaviour changes over time, especially during an epidemic, and this should be taken into account by updating the model regularly. Once errors in prediction start to occur, they tend to cascade, which is what happened with the Google Flu Trends predictions: one week’s errors were passed along to the next week. […] [Similarly,] the Ebola prediction figures published by WHO [during the West African Ebola virus epidemic] were over 50 per cent higher than the cases actually recorded. The problems with both the Google Flu Trends and Ebola analyses were similar in that the prediction algorithms used were based only on initial data and did not take into account changing conditions. Essentially, each of these models assumed that the number of cases would continue to grow at the same rate in the future as they had before the medical intervention began. Clearly, medical and public health measures could be expected to have positive effects and these had not been integrated into the model.”

“Every time a patient visits a doctor’s office or hospital, electronic data is routinely collected. Electronic health records constitute legal documentation of a patient’s healthcare contacts: details such as patient history, medications prescribed, and test results are recorded. Electronic health records may also include sensor data such as Magnetic Resonance Imaging (MRI) scans. The data may be anonymized and pooled for research purposes. It is estimated that in 2015, an average hospital in the USA will store over 600 Tb of data, most of which is unstructured. […] Typically, the human genome contains about 20,000 genes and mapping such a genome requires about 100 Gb of data. […] The interdisciplinary field of bioinformatics has flourished as a consequence of the need to manage and analyze the big data generated by genomics. […] Cloud-based systems give authorized users access to data anywhere in the world. To take just one example, the NHS plans to make patient records available via smartphone by 2018. These developments will inevitably generate more attacks on the data they employ, and considerable effort will need to be expended in the development of effective security methods to ensure the safety of that data. […] There is no absolute certainty on the Web. Since e-documents can be modified and updated without the author’s knowledge, they can easily be manipulated. This situation could be extremely damaging in many different situations, such as the possibility of someone tampering with electronic medical records. […] [S]ome of the problems facing big data systems [include] ensuring they actually work as intended, [that they] can be fixed when they break down, and [that they] are tamper-proof and accessible only to those with the correct authorization.”

“With transactions being made through sales and auction bids, eBay generates approximately 50 Tb of data a day, collected from every search, sale, and bid made on their website by a claimed 160 million active users in 190 countries. […] Amazon collects vast amounts of data including addresses, payment information, and details of everything an individual has ever looked at or bought from them. Amazon uses its data in order to encourage the customer to spend more money with them by trying to do as much of the customer’s market research as possible. In the case of books, for example, Amazon needs to provide not only a huge selection but to focus recommendations on the individual customer. […] Many customers use smartphones with GPS capability, allowing Amazon to collect data showing time and location. This substantial amount of data is used to construct customer profiles allowing similar individuals and their recommendations to be matched. Since 2013, Amazon has been selling customer metadata to advertisers in order to promote their Web services operation […] Netflix collects and uses huge amounts of data to improve customer service, such as offering recommendations to individual customers while endeavouring to provide reliable streaming of its movies. Recommendation is at the heart of the Netflix business model and most of its business is driven by the data-based recommendations it is able to offer customers. Netflix now tracks what you watch, what you browse, what you search for, and the day and time you do all these things. It also records whether you are using an iPad, TV, or something else. […] As well as collecting search data and star ratings, Netflix can now keep records on how often users pause or fast forward, and whether or not they finish watching each programme they start. They also monitor how, when, and where they watched the programme, and a host of other variables too numerous to mention.”

“Data science is becoming a popular study option in universities but graduates so far have been unable to meet the demands of commerce and industry, where positions in data science offer high salaries to experienced applicants. Big data for commercial enterprises is concerned with profit, and disillusionment will set in quickly if an over-burdened data analyst with insufficient experience fails to deliver the expected positive results. All too often, firms are asking for a one-size-fits-all model of data scientist who is expected to be competent in everything from statistical analysis to data storage and data security.”

“In December 2016, Yahoo! announced that a data breach involving over one billion user accounts had occurred in August 2013. Dubbed the biggest ever cyber theft of personal data, or at least the biggest ever divulged by any company, thieves apparently used forged cookies, which allowed them access to accounts without the need for passwords. This followed the disclosure of an attack on Yahoo! in 2014, when 500 million accounts were compromised. […] The list of big data security breaches increases almost daily. Data theft, data ransom, and data sabotage are major concerns in a data-centric world. There have been many scares regarding the security and ownership of personal digital data. Before the digital age we used to keep photos in albums and negatives were our backup. After that, we stored our photos electronically on a hard-drive in our computer. This could possibly fail and we were wise to have back-ups but at least the files were not publicly accessible. Many of us now store data in the Cloud. […] If you store all your photos in the Cloud, it’s highly unlikely with today’s sophisticated systems that you would lose them. On the other hand, if you want to delete something, maybe a photo or video, it becomes difficult to ensure all copies have been deleted. Essentially you have to rely on your provider to do this. Another important issue is controlling who has access to the photos and other data you have uploaded to the Cloud. […] although the Internet and Cloud-based computing are generally thought of as wireless, they are anything but; data is transmitted through fibre-optic cables laid under the oceans. Nearly all digital communication between continents is transmitted in this way. My email will be sent via transatlantic fibre-optic cables, even if I am using a Cloud computing service. The Cloud, an attractive buzz word, conjures up images of satellites sending data across the world, but in reality Cloud services are firmly rooted in a distributed network of data centres providing Internet access, largely through cables. Fibre-optic cables provide the fastest means of data transmission and so are generally preferable to satellites.”

Links:

Health care informatics.
Electronic health records.
European influenza surveillance network.
Overfitting.
Public Health Emergency of International Concern.
Virtual Physiological Human project.
Watson (computer).
Natural language processing.
Anthem medical data breach.
Electronic delay storage automatic calculator (EDSAC). LEO (computer). ICL (International Computers Limited).
E-commerce. Online shopping.
Pay-per-click advertising model. Google AdWords. Click fraud. Targeted advertising.
Recommender system. Collaborative filtering.
Anticipatory shipping.
BlackPOS Malware.
Data Encryption Standard algorithm. EFF DES cracker.
Advanced Encryption Standard.
Tempora. PRISM (surveillance program). Edward Snowden. WikiLeaks. Tor (anonymity network). Silk Road (marketplace). Deep web. Internet of Things.
Songdo International Business District. Smart City.
United Nations Global Pulse.

July 19, 2018 Posted by | Books, Computer science, Cryptography, Data, Engineering, Epidemiology, Statistics | Leave a comment

Big Data (I?)

Below are a few observations from the first half of the book, as well as some links related to the topic coverage.

“The data we derive from the Web can be classified as structured, unstructured, or semi-structured. […] Carefully structured and tabulated data is relatively easy to manage and is amenable to statistical analysis, indeed until recently statistical analysis methods could be applied only to structured data. In contrast, unstructured data is not so easily categorized, and includes photos, videos, tweets, and word-processing documents. Once the use of the World Wide Web became widespread, it transpired that many such potential sources of information remained inaccessible because they lacked the structure needed for existing analytical techniques to be applied. However, by identifying key features, data that appears at first sight to be unstructured may not be completely without structure. Emails, for example, contain structured metadata in the heading as well as the actual unstructured message […] and so may be classified as semi-structured data. Metadata tags, which are essentially descriptive references, can be used to add some structure to unstructured data. […] Dealing with unstructured data is challenging: since it cannot be stored in traditional databases or spreadsheets, special tools have had to be developed to extract useful information. […] Approximately 80 per cent of the world’s data is unstructured in the form of text, photos, and images, and so is not amenable to the traditional methods of structured data analysis. ‘Big data’ is now used to refer not just to the total amount of data generated and stored electronically, but also to specific datasets that are large in both size and complexity, with which new algorithmic techniques are required in order to extract useful information from them.”
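
The email example lends itself to a tiny demonstration (my own, with a made-up message): the headers are the structured/semi-structured metadata, the body is the unstructured part.

```python
# Splitting an email into structured metadata (headers) and unstructured text.
from email import message_from_string

raw = """\
From: alice@example.com
To: bob@example.com
Subject: Flu symptoms?

Feeling awful today -- do these look like flu symptoms to you?
"""

msg = message_from_string(raw)
metadata = {name: msg[name] for name in ("From", "To", "Subject")}   # structured part
body = msg.get_payload()                                             # unstructured part
print(metadata)
print(body)
```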

“In the digital age we are no longer entirely dependent on samples, since we can often collect all the data we need on entire populations. But the size of these increasingly large sets of data cannot alone provide a definition for the term ‘big data’ — we must include complexity in any definition. Instead of carefully constructed samples of ‘small data’ we are now dealing with huge amounts of data that has not been collected with any specific questions in mind and is often unstructured. In order to characterize the key features that make data big and move towards a definition of the term, Doug Laney, writing in 2001, proposed using the three ‘v’s: volume, variety, and velocity. […] ‘Volume’ refers to the amount of electronic data that is now collected and stored, which is growing at an ever-increasing rate. Big data is big, but how big? […] Generally, we can say the volume criterion is met if the dataset is such that we cannot collect, store, and analyse it using traditional computing and statistical methods. […] Although a great variety of data [exists], ultimately it can all be classified as structured, unstructured, or semi-structured. […] Velocity is necessarily connected with volume: the faster the data is generated, the more there is. […] Velocity also refers to the speed at which data is electronically processed. For example, sensor data, such as that generated by an autonomous car, is necessarily generated in real time. If the car is to work reliably, the data […] must be analysed very quickly […] Variability may be considered as an additional dimension of the velocity concept, referring to the changing rates in flow of data […] computer systems are more prone to failure [during peak flow periods]. […] As well as the original three ‘v’s suggested by Laney, we may add ‘veracity’ as a fourth. Veracity refers to the quality of the data being collected. […] Taken together, the four main characteristics of big data – volume, variety, velocity, and veracity – present a considerable challenge in data management.” [As regular readers of this blog might be aware, not everybody would agree with the author here about the inclusion of veracity as a defining feature of big data – “Many have suggested that there are more V’s that are important to the big data problem [than volume, variety & velocity] such as veracity and value (IEEE BigData 2013). Veracity refers to the trustworthiness of the data, and value refers to the value that the data adds to creating knowledge about a topic or situation. While we agree that these are important data characteristics, we do not see these as key features that distinguish big data from regular data. It is important to evaluate the veracity and value of all data, both big and small. (Knoth & Schmid)]

“Anyone who uses a personal computer, laptop, or smartphone accesses data stored in a database. Structured data, such as bank statements and electronic address books, are stored in a relational database. In order to manage all this structured data, a relational database management system (RDBMS) is used to create, maintain, access, and manipulate the data. […] Once […] the database [has been] constructed we can populate it with data and interrogate it using structured query language (SQL). […] An important aspect of relational database design involves a process called normalization which includes reducing data duplication to a minimum and hence reduces storage requirements. This allows speedier queries, but even so as the volume of data increases the performance of these traditional databases decreases. The problem is one of scalability. Since relational databases are essentially designed to run on just one server, as more and more data is added they become slow and unreliable. The only way to achieve scalability is to add more computing power, which has its limits. This is known as vertical scalability. So although structured data is usually stored and managed in an RDBMS, when the data is big, say in terabytes or petabytes and beyond, the RDBMS no longer works efficiently, even for structured data. An important feature of relational databases and a good reason for continuing to use them is that they conform to the following group of properties: atomicity, consistency, isolation, and durability, usually known as ACID. Atomicity ensures that incomplete transactions cannot update the database; consistency excludes invalid data; isolation ensures one transaction does not interfere with another transaction; and durability means that the database must update before the next transaction is carried out. All these are desirable properties but storing and accessing big data, which is mostly unstructured, requires a different approach. […] given the current data explosion there has been intensive research into new storage and management techniques. In order to store these massive datasets, data is distributed across servers. As the number of servers involved increases, the chance of failure at some point also increases, so it is important to have multiple, reliably identical copies of the same data, each stored on a different server. Indeed, with the massive amounts of data now being processed, systems failure is taken as inevitable and so ways of coping with this are built into the methods of storage.”
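
As a small illustration of the structured/relational side (my own example, using Python’s built-in sqlite3 module; the table and values are made up):

```python
# A minimal relational store: structured rows, queried with SQL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statements (id INTEGER PRIMARY KEY, account TEXT, amount REAL)")
conn.executemany("INSERT INTO statements (account, amount) VALUES (?, ?)",
                 [("A-001", 120.50), ("A-001", -40.00), ("A-002", 99.99)])
conn.commit()

# Structured query language: aggregate per account.
for account, total in conn.execute(
        "SELECT account, SUM(amount) FROM statements GROUP BY account"):
    print(account, total)
conn.close()
```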

“A distributed file system (DFS) provides effective and reliable storage for big data across many computers. […] Hadoop DFS [is] one of the most popular DFS […] When we use Hadoop DFS, the data is distributed across many nodes, often tens of thousands of them, physically situated in data centres around the world. […] The NameNode deals with all requests coming in from a client computer; it distributes storage space, and keeps track of storage availability and data location. It also manages all the basic file operations (e.g. opening and closing files) and controls data access by client computers. The DataNodes are responsible for actually storing the data and in order to do so, create, delete, and replicate blocks as necessary. Data replication is an essential feature of the Hadoop DFS. […] It is important that several copies of each block are stored so that if a DataNode fails, other nodes are able to take over and continue with processing tasks without loss of data. […] Data is written to a DataNode only once but will be read by an application many times. […] One of the functions of the NameNode is to determine the best DataNode to use given the current usage, ensuring fast data access and processing. The client computer then accesses the data block from the chosen node. DataNodes are added as and when required by the increased storage requirements, a feature known as horizontal scalability. One of the main advantages of Hadoop DFS over a relational database is that you can collect vast amounts of data, keep adding to it, and, at that time, not yet have any clear idea of what you want to use it for. […] structured data with identifiable rows and columns can be easily stored in a RDBMS while unstructured data can be stored cheaply and readily using a DFS.”
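
The replication idea is simple enough to mimic in a few lines; the sketch below is a toy model of my own, not Hadoop’s actual API or placement policy: each block is written to several ‘DataNodes’, and a lookup table playing the role of the NameNode’s metadata lets us check that a single node failure loses no data.

```python
# Toy model of block replication across data nodes (not Hadoop's API).
import random

REPLICATION = 3
nodes = {f"node{i}": set() for i in range(10)}       # DataNode -> blocks it holds
placement = {}                                        # 'NameNode' metadata: block -> nodes

for b in (f"block{i}" for i in range(20)):
    chosen = random.sample(sorted(nodes), REPLICATION)
    placement[b] = chosen
    for nd in chosen:
        nodes[nd].add(b)

failed = "node3"                                      # simulate one DataNode failing
all_readable = all(any(nd != failed for nd in held) for held in placement.values())
print("every block still readable after the failure:", all_readable)
```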

NoSQL is the generic name used to refer to non-relational databases and stands for Not only SQL. […] The non-relational model has some features that are necessary in the management of big data, namely scalability, availability, and performance. With a relational database you cannot keep scaling vertically without loss of function, whereas with NoSQL you scale horizontally and this enables performance to be maintained. […] Within the context of a distributed database system, consistency refers to the requirement that all copies of data should be the same across nodes. […] Availability requires that if a node fails, other nodes still function […] Data, and hence DataNodes, are distributed across physically separate servers and communication between these machines will sometimes fail. When this occurs it is called a network partition. Partition tolerance requires that the system continues to operate even if this happens. In essence, what the CAP [Consistency, Availability, Partition Tolerance] Theorem states is that for any distributed computer system, where the data is shared, only two of these three criteria can be met. There are therefore three possibilities; the system must be: consistent and available, consistent and partition tolerant, or partition tolerant and available. Notice that since in a RDMS the network is not partitioned, only consistency and availability would be of concern and the RDMS model meets both of these criteria. In NoSQL, since we necessarily have partitioning, we have to choose between consistency and availability. By sacrificing availability, we are able to wait until consistency is achieved. If we choose instead to sacrifice consistency it follows that sometimes the data will differ from server to server. The somewhat contrived acronym BASE (Basically Available, Soft, and Eventually consistent) is used as a convenient way of describing this situation. BASE appears to have been chosen in contrast to the ACID properties of relational databases. ‘Soft’ in this context refers to the flexibility in the consistency requirement. The aim is not to abandon any one of these criteria but to find a way of optimizing all three, essentially a compromise. […] The name NoSQL derives from the fact that SQL cannot be used to query these databases. […] There are four main types of non-relational or NoSQL database: key-value, column-based, document, and graph – all useful for storing large amounts of structured and semi-structured data. […] Currently, an approach called NewSQL is finding a niche. […] the aim of this latent technology is to solve the scalability problems associated with the relational model, making it more useable for big data.”

“A popular way of dealing with big data is to divide it up into small chunks and then process each of these individually, which is basically what MapReduce does by spreading the required calculations or queries over many, many computers. […] Bloom filters are particularly suited to applications where storage is an issue and where the data can be thought of as a list. The basic idea behind Bloom filters is that we want to build a system, based on a list of data elements, to answer the question ‘Is X in the list?’ With big datasets, searching through the entire set may be too slow to be useful, so we use a Bloom filter which, being a probabilistic method, is not 100 per cent accurate—the algorithm may decide that an element belongs to the list when actually it does not; but it is a fast, reliable, and storage efficient method of extracting useful knowledge from data. Bloom filters have many applications. For example, they can be used to check whether a particular Web address leads to a malicious website. In this case, the Bloom filter would act as a blacklist of known malicious URLs against which it is possible to check, quickly and accurately, whether it is likely that the one you have just clicked on is safe or not. Web addresses newly found to be malicious can be added to the blacklist. […] A related example is that of malicious email messages, which may be spam or may contain phishing attempts. A Bloom filter provides us with a quick way of checking each email address and hence we would be able to issue a timely warning if appropriate. […] they can [also] provide a very useful way of detecting fraudulent credit card transactions.”
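
Since the Bloom filter is so central to the passage above, here is a stripped-down version (my own simplified sketch, not an optimized implementation): it answers ‘is X in the list?’ quickly and compactly, with occasional false positives but no false negatives.

```python
# A minimal Bloom filter: fast, compact, probabilistic membership testing.
import hashlib

class BloomFilter:
    def __init__(self, size_bits=10_000, n_hashes=4):
        self.size = size_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(size_bits)          # one byte per bit, for simplicity

    def _positions(self, item):
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))

blacklist = BloomFilter()
for url in ("http://malicious.example/a", "http://phishing.example/b"):
    blacklist.add(url)

print(blacklist.might_contain("http://malicious.example/a"))    # True
print(blacklist.might_contain("http://harmless.example/page"))  # almost certainly False
```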

Links:

Data.
Punched card.
Clickstream log.
HTTP cookie.
Australian Square Kilometre Array Pathfinder.
The Millionaire Calculator.
Data mining.
Supervised machine learning.
Unsupervised machine learning.
Statistical classification.
Cluster analysis.
Moore’s Law.
Cloud storage. Cloud computing.
Data compression. Lossless data compression. Lossy data compression.
ASCII. Huffman algorithm. Variable-length encoding.
Data compression ratio.
Grayscale.
Discrete cosine transform.
JPEG.
Bit array. Hash function.
PageRank algorithm.
Common crawl.

July 14, 2018 Posted by | Books, Computer science, Data, Statistics | Leave a comment

Robotics

“This book is not about the psychology or cultural anthropology of robotics, interesting as those are. I am an engineer and roboticist, so I confine myself firmly to the technology and application of real physical robots. […] robotics is the study of the design, application, and use of robots, and that is precisely what this Very Short Introduction is about: what robots do and what roboticists do.”

The above quote is from the book’s preface; the book is quite decent and occasionally really quite fascinating. Below I have added some sample quotes and links to topics/stuff covered in the book.

“Some or all of […] five functions – sensing, signalling, moving, intelligence, and energy, integrated into a body – are present in all robots. The actual sensors, motors, and behaviours designed into a particular robot body shape depend on the job that robot is designed to do. […] A robot is: 1. an artificial device that can sense its environment and purposefully act on or in that environment; 2. an embodied artificial intelligence; or 3. a machine that can autonomously carry out useful work. […] Many real-world robots […] are not autonomous but remotely operated by humans. […] These are also known as tele-operated robots. […] From a robot design point of view, the huge advantage of tele-operated robots is that the human in the loop provides the robot’s ‘intelligence’. One of the most difficult problems in robotics — the design of the robot’s artificial intelligence — is therefore solved, so it’s not surprising that so many real-world robots are tele-operated. The fact that tele-operated robots alleviate the problem of AI design should not fool us into making the mistake of thinking that tele-operated robots are not sophisticated — they are. […] counter-intuitively, autonomous robots are often simpler than tele-operated robots […] When roboticists talk about autonomous robots they normally mean robots that decide what to do next entirely without human intervention or control. We need to be careful here because they are not talking about true autonomy, in the sense that you or I would regard ourselves as self-determining individuals, but what I would call ‘control autonomy’. By control autonomy I mean that the robot can undertake its task, or mission, without human intervention, but that mission is still programmed or commanded by a human. In fact, there are very few robots in use in the real world that are autonomous even in this limited sense. […] It is helpful to think about a spectrum of robot autonomy, from remotely operated at one end (no autonomy) to fully autonomous at the other. We can then place robots on this spectrum according to their degree of autonomy. […] On a scale of autonomy, a robot that can react on its own in response to its sensors is highly autonomous. A robot that cannot react, perhaps because it doesn’t have any sensors, is not.”

“It is […] important to note that autonomy and intelligence are not the same thing. A robot can be autonomous but not very smart, like a robot vacuum cleaner. […] A robot vacuum cleaner has a small number of preprogrammed (i.e. instinctive) behaviours and is not capable of any kind of learning […] These are characteristics we would associate with very simple animals. […] When roboticists describe a robot as intelligent, what they mean is ‘a robot that behaves, in some limited sense, as if it were intelligent’. The words as if are important here. […] There are basically two ways in which we can make a robot behave as if it is more intelligent: 1. preprogram a larger number of (instinctive) behaviours; and/or 2. design the robot so that it can learn and therefore develop and grow its own intelligence. The first of these approaches is fine, providing that we know everything there is to know about what the robot must do and all of the situations it will have to respond to while it is working. Typically we can only do this if we design both the robot and its operational environment. […] For unstructured environments, the first approach to robot intelligence above is infeasible simply because it’s impossible to anticipate every possible situation a robot might encounter, especially if it has to interact with humans. The only solution is to design a robot so that it can learn, either from its own experience or from humans or other robots, and therefore adapt and develop its own intelligence: in effect, grow its behavioural repertoire to be able to respond appropriately to more and more situations. This brings us to the subject of learning robots […] robot learning or, more generally, ‘machine learning’ — a branch of AI — has proven to be very much harder than was expected in the early days of Artificial Intelligence.”

“Robot arms on an assembly line are typically programmed to go through a fixed sequence of moves over and over again, for instance spot-welding car body panels, or spray-painting the complete car. These robots are therefore not intelligent. In fact, they often have no exteroceptive sensors at all. […] when we see an assembly line with multiple robot arms positioned on either side along a line, we need to understand that the robots are part of an integrated automated manufacturing system, in which each robot and the line itself have to be carefully programmed in order to coordinate and choreograph the whole operation. […] An important characteristic of assembly-line robots is that they require the working environment to be designed for and around them, i.e. a structured environment. They also need that working environment to be absolutely predictable and repeatable. […] Robot arms either need to be painstakingly programmed, so that the precise movement required of each joint is worked out and coded into a set of instructions for the robot arm or, more often (and rather more easily), ‘taught’ by a human using a control pad to move its end-effector (hand) to the required positions in the robot’s workspace. The robot then memorizes the set of joint movements so that they can be replayed (over and over again). The human operator teaching the robot controls the trajectory, i.e. the path the robot arm’s end-effector follows as it moves through its 3D workspace, and a set of mathematical equations called the ‘inverse kinematics’ converts the trajectory into a set of individual joint movements. Using this approach, it is relatively easy to teach a robot arm to pick up an object and move it smoothly to somewhere else in its workspace while keeping the object level […]. However […] most real-world robot arms are unable to sense the weight of the object and automatically adjust accordingly. They are simply designed with stiff enough joints and strong enough motors that, whatever the weight of the object (providing it’s within the robot’s design limits), it can be lifted, moved, and placed with equal precision. […] The robot arm and gripper are a foundational technology in robotics. Not only are they extremely important as […] industrial assembly-line robot[s], but they have become a ‘component’ in many areas of robotics.”
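
The inverse-kinematics step mentioned above can be illustrated for the simplest interesting case, an idealized planar two-link arm (my own sketch, not from the book): given a target position for the end-effector, recover a pair of joint angles that reaches it.

```python
# Inverse kinematics for an idealized planar two-link arm.
from math import acos, atan2, cos, sin, hypot

def two_link_ik(x, y, l1=1.0, l2=0.8):
    """Return one (theta1, theta2) solution, or None if the target is out of reach."""
    r = hypot(x, y)
    if r > l1 + l2 or r < abs(l1 - l2):
        return None                                  # target outside the workspace
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    theta2 = acos(max(-1.0, min(1.0, c2)))           # the 'elbow-down' solution
    theta1 = atan2(y, x) - atan2(l2 * sin(theta2), l1 + l2 * cos(theta2))
    return theta1, theta2

def forward(theta1, theta2, l1=1.0, l2=0.8):
    """Forward kinematics, used here only to check the IK solution."""
    return (l1 * cos(theta1) + l2 * cos(theta1 + theta2),
            l1 * sin(theta1) + l2 * sin(theta1 + theta2))

angles = two_link_ik(1.2, 0.9)
print(angles, forward(*angles))                      # should land back near (1.2, 0.9)
```

Real industrial arms have six or more joints, and the corresponding mapping is more involved and generally has multiple solutions, but the principle is the same: trajectories specified in the workspace are converted into individual joint movements.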

Planetary rovers are tele-operated mobile robots that present the designer and operator with a number of very difficult challenges. One challenge is power: a planetary rover needs to be energetically self-sufficient for the lifetime of its mission, and must either be launched with a power source or — as in the case of the Mars rovers — fitted with solar panels capable of recharging the rover’s on-board batteries. Another challenge is dependability. Any mechanical fault is likely to mean the end of the rover’s mission, so it needs to be designed and built to exceptional standards of reliability and fail-safety, so that if parts of the rover should fail, the robot can still operate, albeit with reduced functionality. Extremes of temperature are also a problem […] But the greatest challenge is communication. With a round-trip signal delay time of twenty minutes to Mars and back, tele-operating the rover in real time is impossible. If the rover is moving and its human operator in the command centre on Earth reacts to an obstacle, it’s likely to be already too late; the robot will have hit the obstacle by the time the command signal to turn reaches the rover. An obvious answer to this problem would seem to be to give the rover a degree of autonomy so that it could, for instance, plan a path to a rock or feature of interest — while avoiding obstacles — then, when it arrives at the point of interest, call home and wait. Although path-planning algorithms capable of this level of autonomy have been well developed, the risk of a failure of the algorithm (and hence perhaps the whole mission) is deemed so high that in practice the rovers are manually tele-operated, at very low speed, with each manual manoeuvre carefully planned. When one also takes into account the fact that the Mars rovers are contactable only for a three-hour window per Martian day, a traverse of 100 metres will typically take up one day of operation at an average speed of 30 metres per hour.”

“The realization that the behaviour of an autonomous robot is an emergent property of its interactions with the world has important and far-reaching consequences for the way we design autonomous robots. […] when we design robots, and especially when we come to decide what behaviours to programme the robot’s AI with, we cannot think about the robot on its own. We must take into account every detail of the robot’s working environment. […] Like all machines, robots need power. For fixed robots, like the robot arms used for manufacture, power isn’t a problem because the robot is connected to the electrical mains supply. But for mobile robots power is a huge problem because mobile robots need to carry their energy supply around with them, with problems of both the size and weight of the batteries and, more seriously, how to recharge those batteries when they run out. For autonomous robots, the problem is acute because a robot cannot be said to be truly autonomous unless it has energy autonomy as well as computational autonomy; there seems little point in building a smart robot that ‘dies’ when its battery runs out. […] Localization is a[nother] major problem in mobile robotics; in other words, how does a robot know where it is, in 2D or 3D space. […] [One] type of robot learning is called reinforcement learning. […] it is a kind of conditioned learning. If a robot is able to try out several different behaviours, test the success or failure of each behaviour, then ‘reinforce’ the successful behaviours, it is said to have reinforcement learning. Although this sounds straightforward in principle, it is not. It assumes, first, that a robot has at least one successful behaviour in its list of behaviours to try out, and second, that it can test the benefit of each behaviour — in other words, that the behaviour has an immediate measurable reward. If a robot has to try every possible behaviour or if the rewards are delayed, then this kind of so-called ‘unsupervised’ individual robot learning is very slow.”
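
A toy version of the ‘try behaviours, reinforce the successful ones’ loop (my own sketch, with made-up success probabilities) – it also hints at why this kind of learning is slow: the agent needs many trials before the reward estimates separate cleanly.

```python
# Epsilon-greedy reinforcement of behaviours with unknown success rates.
import random

true_success = {"behaviour_A": 0.2, "behaviour_B": 0.7, "behaviour_C": 0.4}  # hidden from the robot
estimates = {b: 0.0 for b in true_success}
counts = {b: 0 for b in true_success}
epsilon = 0.1

for _ in range(2000):
    if random.random() < epsilon:                        # occasionally explore
        b = random.choice(list(true_success))
    else:                                                # otherwise exploit the best estimate
        b = max(estimates, key=estimates.get)
    reward = 1.0 if random.random() < true_success[b] else 0.0
    counts[b] += 1
    estimates[b] += (reward - estimates[b]) / counts[b]  # incremental mean of observed rewards

print({b: round(v, 2) for b, v in estimates.items()}, counts)
```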

“A robot is described as humanoid if it has a shape or structure that to some degree mimics the human form. […] A small subset of humanoid robots […] attempt a greater degree of fidelity to the human form and appearance, and these are referred to as android. […] It is a recurring theme of this book that robot intelligence technology lags behind robot mechatronics – and nowhere is the mismatch between the two so starkly evident as it is in android robots. The problem is that if a robot looks convincingly human, then we (not unreasonably) expect it to behave like a human. For this reason whole-body android robots are, at the time of writing, disappointing. […] It is important not to overstate the case for humanoid robots. Without doubt, many potential applications of robots in human work- or living spaces would be better served by non-humanoid robots. The humanoid robot to use human tools argument doesn’t make sense if the job can be done autonomously. It would be absurd, for instance, to design a humanoid robot in order to operate a vacuum cleaner designed for humans. Similarly, if we want a driverless car, it doesn’t make sense to build a humanoid robot that sits in the driver’s seat. It seems that the case for humanoid robots is strongest when the robots are required to work alongside, learn from, and interact closely with humans. […] One of the most compelling reasons why robots should be humanoid is for those applications in which the robot has to interact with humans, work in human workspaces, and use tools or devices designed for humans.”

“…to put it bluntly, sex with a robot might not be safe. As soon as a robot has motors and moving parts, then assuring the safety of human-robot interaction becomes a difficult problem and if that interaction is intimate, the consequences of a mechanical or control systems failure could be serious.”

“All of the potential applications of humanoid robots […] have one thing in common: close interaction between human and robot. The nature of that interaction will be characterized by close proximity and communication via natural human interfaces – speech, gesture, and body language. Human and robot may or may not need to come into physical contact, but even when direct contact is not required they will still need to be within each other’s body space. It follows that robot safety, dependability, and trustworthiness are major issues for the robot designer. […] making a robot safe isn’t the same as making it trustworthy. One person trusts another if, generally speaking, that person is reliable and does what they say they will. So if I were to provide a robot that helps to look after your grandmother and I claim that it is perfectly safe — that it’s been designed to cover every risk or hazard — would you trust it? The answer is probably not. Trust in robots, just as in humans, has to be earned. […for more on these topics, see this post – US] […] trustworthiness cannot just be designed into the robot — it has to be earned by use and by experience. Consider a robot intended to fetch drinks for an elderly person. Imagine that the person calls for a glass of water. The robot then needs to fetch the drink, which may well require the robot to find a glass and fill it with water. Those tasks require sensing, dexterity, and physical manipulation, but they are problems that can be solved with current technology. The problem of trust arises when the robot brings the glass of water to the human. How does the robot give the glass to the human? If the robot has an arm so that it can hold out the glass in the same way a human would, how would the robot know when to let go? The robot clearly needs sensors in order to see and feel when the human has taken hold of the glass. The physical process of a robot handing something to a person is fraught with difficulty. Imagine, for instance, that the robot holds out its arm with the glass but the human can’t reach the glass. How does the robot decide where and how far it would be safe to bring its arm toward the person? What if the human takes hold of the glass but then the glass slips; does the robot let it fall or should it — as a human would — renew its grip on the glass? At what point would the robot decide the transaction has failed: it can’t give the glass of water to the person, or they won’t take it; perhaps they are asleep, or simply forgotten they wanted a glass of water, or confused. How does the robot sense that it should give up and perhaps call for assistance? These are difficult problems in robot cognition. Until they are solved, it’s doubtful we could trust a robot sufficiently well to do even a seemingly simple thing like handing over a glass of water.”
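
The glass-handing scenario above is essentially a sequence of decisions driven by sensing, and it can be sketched as a simple state machine. This is not from the book and does not make the hard problems any easier – every predicate called on the hypothetical robot object below (person_has_grasped, glass_slipping, and so on) stands in for exactly the kind of perception and cognition problem the author is describing:

```python
from enum import Enum, auto
import time

class Handover(Enum):
    OFFER = auto()        # hold out the glass within the person's reach
    WAIT_GRASP = auto()   # wait until sensors indicate the person holds the glass
    RELEASE = auto()      # open the gripper once the grasp looks stable
    RECOVER = auto()      # renew the grip if the glass starts to slip
    GIVE_UP = auto()      # abort and call for assistance

def hand_over_glass(robot, timeout_s=30.0):
    """Rough sketch only: every method called on `robot` is hypothetical."""
    state = Handover.OFFER
    deadline = time.monotonic() + timeout_s
    while True:
        if time.monotonic() > deadline and state is not Handover.GIVE_UP:
            state = Handover.GIVE_UP      # person asleep, unwilling, confused...
        if state is Handover.OFFER:
            robot.extend_arm_towards_person()
            state = Handover.WAIT_GRASP
        elif state is Handover.WAIT_GRASP:
            if robot.person_has_grasped():
                state = Handover.RELEASE
        elif state is Handover.RELEASE:
            robot.open_gripper()
            if robot.glass_slipping():
                state = Handover.RECOVER
            else:
                return True               # handover succeeded
        elif state is Handover.RECOVER:
            robot.close_gripper()         # renew the grip, as a human would
            state = Handover.OFFER
        elif state is Handover.GIVE_UP:
            robot.retract_arm()
            robot.call_for_assistance()
            return False
```

Writing the control flow down is the easy part; deciding reliably whether person_has_grasped() or glass_slipping() is actually true, from real sensor data, is where the trust problem lives.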

“The fundamental problem with Asimov’s laws of robotics, or any similar construction, is that they require the robot to make judgments. […] they assume that the robot is capable of some level of moral agency. […] No robot that we can currently build, or will build in the foreseeable future, is ‘intelligent’ enough to be able to even recognize, let alone make, these kinds of choices. […] Most roboticists agree that for the foreseeable future robots cannot be ethical, moral agents. […] precisely because, as we have seen, present-day ‘intelligent’ robots are not very intelligent, there is a danger of a gap between what robot users believe those robots to be capable of and what they are actually capable of. Given humans’ propensity to anthropomorphize and form emotional attachments to machines, there is clearly a danger that such vulnerabilities could be either unwittingly or deliberately exploited. Although robots cannot be ethical, roboticists should be.”

“In robotics research, the simulator has become an essential tool of the roboticist’s trade. The reason for this is that designing, building, and testing successive versions of real robots is both expensive and time-consuming, and if part of that work can be undertaken in the virtual rather than the real world, development times can be shortened, and the chances of a robot that works first time substantially improved. A robot simulator has three essential features. First, it must provide a virtual world. Second, it must offer a facility for creating a virtual model of the real robot. And third, it must allow the robot’s controller to be installed and ‘run’ on the virtual robot in the virtual world; the controller then determines how the robot behaves when running in the simulator. The simulator should also provide a visualization of the virtual world and simulated robots in it so that the designer can see what’s going on. […] These are difficult challenges for developers of robot simulators.”
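
The three essential features listed in the quote map naturally onto a toy program. The sketch below is not how Webots or any real simulator is built – it is just a minimal illustration in which World, VirtualRobot, and the controller function play the roles of the virtual world, the virtual robot model, and the installed controller respectively; all names and the trivial physics are made up.

```python
import math
from dataclasses import dataclass, field

@dataclass
class World:                 # (1) the virtual world: here just circular obstacles
    obstacles: list = field(default_factory=list)   # (x, y, radius) tuples

@dataclass
class VirtualRobot:          # (2) a virtual model of the real robot
    x: float = 0.0
    y: float = 0.0
    heading: float = 0.0

    def sense(self, world):
        # crude proximity sensor: distance to the edge of the nearest obstacle
        if not world.obstacles:
            return float("inf")
        return min(math.hypot(self.x - ox, self.y - oy) - r
                   for ox, oy, r in world.obstacles)

    def act(self, forward, turn, dt):
        self.heading += turn * dt
        self.x += forward * math.cos(self.heading) * dt
        self.y += forward * math.sin(self.heading) * dt

def run(world, robot, controller, steps=200, dt=0.1):
    """(3) install the controller and 'run' it on the virtual robot in the
    virtual world; a real simulator would also render a visualization."""
    for _ in range(steps):
        reading = robot.sense(world)
        forward, turn = controller(reading)   # the controller decides behaviour
        robot.act(forward, turn, dt)
    return robot

# Usage: a trivial controller that slows down and turns when near an obstacle.
robot = run(World(obstacles=[(2.0, 0.0, 0.5)]),
            VirtualRobot(),
            controller=lambda d: (0.2, 0.5) if d < 1.0 else (1.0, 0.0))
```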

“The next big step in miniaturization […] requires the solution of hugely difficult problems and, in all likelihood, the use of exotic approaches to design and fabrication. […] It is impossible to shrink mechanical and electrical components, or MEMS devices, in order to reduce total robot size to a few micrometres. In any event, the physics of locomotion through a fluid changes at the microscale and simply shrinking mechanical components from macro to micro — even if it were possible — would fail to address this problem. A radical approach is to leave behind conventional materials and components and move to a bioengineered approach in which natural bacteria are modified by adding artificial components. The result is a hybrid of artificial and natural (biological) components. The bacterium has many desirable properties for a microbot. By selecting a bacterium with a flagellum, we have locomotion perfectly suited to the medium. […] Another hugely desirable characteristic is that the bacteria are able to naturally scavenge for energy, thus avoiding the otherwise serious problem of powering the microbots. […] Whatever technology is used to create the microbots, huge problems would have to be overcome before a swarm of medical microbots could become a practical reality. The first is technical: how do surgeons or medical technicians reliably control and monitor the swarm while it’s working inside the body? Or, assuming we can give the microbots sufficient intelligence and autonomy (also a very difficult challenge), do we forgo precise control and human intervention altogether by giving the robots the swarm intelligence to be able to do the job, i.e. find the problem, fix it, then exit? […] these questions bring us to what would undoubtedly represent the greatest challenge: validating the swarm of medical microbots as effective, dependable, and above all safe, then gaining approval and public acceptance for its use. […] Do we treat the validation of the medical microbot swarm as an engineering problem, and attempt to apply the same kinds of methods we would use to validate safety-critical systems such as air traffic control systems? Or do we instead regard the medical microbot swarm as a drug and validate it with conventional and (by and large) trusted processes, including clinical trials, leading to approval and licensing for use? My suspicion is that we will need a new combination of both approaches.”

Links:

E-puck mobile robot.
Jacques de Vaucanson’s Digesting Duck.
Cybernetics.
Alan Turing. W. Ross Ashby. Norbert Wiener. Warren McCulloch. William Grey Walter.
Turtle (robot).
Industrial robot. Mechanical arm. Robotic arm. Robot end effector.
Automated guided vehicle.
Remotely operated vehicle. Unmanned aerial vehicle. Remotely operated underwater vehicle. Wheelbarrow (robot).
Robot-assisted surgery.
Lego Mindstorms NXT. NXT Intelligent Brick.
Biomimetic robots.
Artificial life.
Braitenberg vehicle.
Shakey the robot. Sense-Plan-Act. Rodney Brooks. A robust layered control system for a mobile robot.
Toto the robot.
Slugbot. Ecobot. Microbial fuel cell.
Scratchbot.
Simultaneous localization and mapping (SLAM).
Programming by demonstration.
Evolutionary algorithm.
NASA Robonaut. BERT 2. Kismet (robot). Jules (robot). Frubber. Uncanny valley.
AIBO. Paro.
Cronos Robot. ECCEROBOT.
Swarm robotics. S-bot mobile robot. Swarmanoid project.
Artificial neural network.
Symbrion.
Webots.
Kilobot.
Microelectromechanical systems. I-SWARM project.
ALICE (Artificial Linguistic Internet Computer Entity). BINA 48 (Breakthrough Intelligence via Neural Architecture 48).

June 15, 2018 Posted by | Books, Computer science, Engineering, Medicine | Leave a comment

Mathematics in Cryptography III

As she puts it herself, most of this lecture [~first 47 minutes or so] was basically “an explanation by a non-expert on how the internet uses public key” (-cryptography). The last 20 minutes cover, again in her own words, “more theoretical aspects”.

Some links:

ARPANET.
NSFNET.
Hypertext Transfer Protocol (HTTP). HTTPS.
Project Athena. Kerberos (protocol).
Pretty Good Privacy (PGP).
Secure Sockets Layer (SSL)/Transport Layer Security (TLS).
IPsec.
Wireshark.
Cipher suite.
Elliptic Curve Digital Signature Algorithm (ECDSA).
Request for Comments (RFC).
Elliptic-curve Diffie–Hellman (ECDH).
The SSL/TLS Handshake: an Overview.
Advanced Encryption Standard.
Galois/Counter Mode.
XOR gate.
Hexadecimal.
IP Header.
Time to live (TTL).
Transmission Control Protocol. TCP segment structure.
TLS record.
Security level.
Birthday problem. Birthday attack.
Handbook of Applied Cryptography (Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone). (§3.6 in particular is mentioned/referenced as this is stuff she talks about in the last ‘theoretical’ part of the lecture).


June 8, 2018 Posted by | Computer science, Cryptography, Lectures, Mathematics | Leave a comment

Mathematics in Cryptography II

Some links to stuff covered in the lecture:

Public-key cryptography.
New Directions in Cryptography (Diffie & Hellman, 1976).
The history of Non-Secret Encryption (James Ellis).
Note on “Non-Secret Encryption” – Cliff Cocks (1973).
RSA (cryptosystem).
Discrete Logarithm Problem.
Diffie–Hellman key exchange.
AES (Advanced Encryption Standard).
Triple DES.
Trusted third party (TTP).
Key management.
Man-in-the-middle attack.
Digital signature.
Public key certificate.
Secret sharing.
Hash function. Cryptographic hash function.
Secure Hash Algorithm 2 (SHA-2).
Non-repudiation (digital security).
L-notation. L (complexity).
ElGamal signature scheme.
Digital Signature Algorithm (DSA).
Schnorr signature.
Identity-based cryptography.
Identity-Based Cryptosystems and Signature Schemes (Adi Shamir, 1984).
Algorithms for Quantum Computation: Discrete Logarithms and Factoring (Peter Shor, 1994).
Quantum resistant cryptography.
Elliptic curve. Elliptic-curve cryptography.
Projective space.

I have included very few links relating to the topics covered in the last part of the lecture. This was deliberate, not just a consequence of the type of coverage included in that part. In my opinion non-mathematicians should probably skip the last 25 minutes or so, as they're not really worth the effort – and not only because of technical issues (the lecturer is writing on the blackboard and for several minutes you're unable to see what she's writing, which is unfortunate), though those certainly didn't help. The first hour of the lecture is great; the last 25 minutes are, well, less great, in my opinion. You should however not miss the first part of the coverage of the ECC-related material (in particular ~55-58 minutes in) if you're interested in making sense of how ECC works; I certainly found that part of the coverage very helpful.

June 2, 2018 Posted by | Computer science, Cryptography, Lectures, Mathematics, Papers | Leave a comment

Computers, People and the Real World

“An exploration of some spectacular failures of modern day computer-aided systems, which fail to take into account the real-world […] Almost nobody wants an IT system. What they want is a better way of doing something, whether that is buying and selling shares on the Stock Exchange, flying an airliner or running a hospital. So the system they want will usually involve changes to the way people work, and interactions with physical objects and the environment. Drawing on examples including the new programme for IT in the NHS, this lecture explores what can go wrong when business change is mistakenly viewed as an IT project.” (Quote from the video description on youtube).

Some links related to the lecture coverage:
Computer-aided dispatch.
London Ambulance Service – computerization.
Report of the Inquiry Into The London Ambulance Service (February 1993).
Sociotechnical system.
Tay (bot).

A few observations/quotes from the lecture (-notes):

The bidder who least understands the complexity of a requirement is likely to put in the lowest bid.
“It is a mistake to use a computer system to impose new work processes on under-trained or reluctant staff. – Front line staff are often best placed to judge what is practical.” [A quote from later in the lecture [~36 mins] is even more explicit: “The experts in any work process are usually the people who have been carrying it out.”]
“It is important to understand that in any system implementation the people factor is as important, and arguably more important, than the technical infrastructure.” (This last one is a full quote from the report linked above; the lecture includes a shortened version – US) [Quotes and observations above from ~16 minute mark unless otherwise noted]

“There is no such thing as an IT project”
“(almost) every significant “IT Project” is actually a business change project that is enabled and supported by one or more IT systems. Business processes are expensive to change. The business changes take at least as long and cost as much as the new IT system, and need at least as much planning and management” [~29 mins]

“Software packages are packaged business processes
*Changing a package to fit the way you want to work can cost more than writing new software” [~31-32 mins]

“Most computer systems interact with people: the sociotechnical view is that the people and the IT are two components of a larger system. Designing that larger system is the real task.” [~36 mins]

May 31, 2018 Posted by | Computer science, Economics, Engineering, Lectures | Leave a comment

On the cryptographic hardness of finding a Nash equilibrium

I found it annoying that you generally can't really hear the questions posed by the audience (which includes people like Avi Wigderson), especially considering that there are quite a few of them, particularly in the middle section of the lecture. There are intermittent issues with the camera's focus throughout the talk, but those are transitory problems that should not keep you from watching the lecture. The sound issue at the beginning of the talk is resolved after 40 seconds.

One important take-away from this talk, if you choose not to watch it: “to date, there is no known efficient algorithm to find Nash equilibrium in games”. This paper – coauthored by the lecturer – seems, from a brief skim, to cover many of the topics also included in the lecture. I have added some other links to articles and topics covered/mentioned in the lecture below.

Nash’s Existence Theorem.
Reducibility Among Equilibrium Problems (Goldberg & Papadimitriou).
Three-Player Games Are Hard (Daskalakis & Papadimitriou).
3-Nash is PPAD-Complete (Chen & Deng).
PPAD (complexity).
NP-hardness.
On the (Im)possibility of Obfuscating Programs (Barak et al.).
On the Impossibility of Obfuscation with Auxiliary Input (Goldwasser & Kalai).
On Best-Possible Obfuscation (Goldwasser & Rothblum).
Functional Encryption without Obfuscation (Garg et al.).
On the Complexity of the Parity Argument and Other Inefficient Proofs of Existence (Papadimitriou).
Pseudorandom function family.
Revisiting the Cryptographic Hardness of Finding a Nash Equilibrium (Garg, Pandey & Srinivasan).
Constrained Pseudorandom Functions and Their Applications (Boneh & Waters).
Delegatable Pseudorandom Functions and Applications (Kiayias et al.).
Functional Signatures and Pseudorandom Functions (Boyle, Goldwasser & Ivan).
Universal Constructions and Robust Combiners for Indistinguishability Obfuscation and Witness Encryption (Ananth et al.).

April 18, 2018 Posted by | Computer science, Cryptography, Game theory, Lectures, Mathematics, Papers | Leave a comment

The Internet of Things


Some links to stuff he talks about in the lecture:

The Internet of Things: making the most of the Second Digital Revolution – A report by the UK Government Chief Scientific Adviser.
South–North Water Transfer Project.
FDA approves first smart pill that tracks drug regimen compliance from the inside.
The Internet of Things (IoT)* units installed base by category from 2014 to 2020.
Share of the IoT market by sub-sector worldwide in 2017.
San Diego to Cover Half the City with Intelligent Streetlights.
IPv4 and IPv6 (specifically, he talks a little about the IPv4 address space problem).
General Data Protection Regulation (GDPR).
Shodan (website).
Mirai botnet.
Gait analysis.
Website reveals 73,000 unprotected security cameras with default passwords. (This was just an example link – it’s unclear if the site he used to illustrate his point in that part of the lecture was actually Insecam, but he does talk about the widespread use of default passwords and related security implications during the lecture).
Strava’s fitness heatmaps are a ‘potential catastrophe’.
‘Secure by Design’ (a very recently published proposed UK IoT code of practice).

March 26, 2018 Posted by | Computer science, Engineering, Lectures | Leave a comment