Monday, March 29, 2010

People mining ;-)

People mining, like data mining, could be used for good or bad purposes, just like everything else in life.


Sunday, March 28, 2010

Securing systems dealing with sensitive information

I went through the executive summary of an audit report that assessed the security measures in place at a popular clinical information system in Canada. The 10 recommendations the report makes are quite useful when implementing any access-controlled information system. They are not new; they are well-known principles (need-to-know, defense-in-depth, leakage prevention, auditing, etc.) that are largely neglected in practice.

Friday, March 26, 2010

SEO poisoning on the rise

The people who push malware love to trap victims via search. Security companies refer to what they do as "SEO (Search Engine Optimization) poisoning." They identify popular search terms, figure out which ones are likely to bring them suitable targets, and then optimize pages so engines like Google and Bing display their results on the first page -- mixed in amongst the non-malicious pages you actually wanted to find.

So what search words are most likely to get you into trouble? Bearshare (46% malicious sites) and screensaver (42% malicious sites).
The blog post here gives an idea of what kinds of black hat SEO techniques are frequently employed by cyber criminals.
Search engine optimization (SEO) is a collection of techniques used to achieve higher search rankings for a given website. "Black hat SEO" is the method of using unethical SEO techniques in order to obtain a higher search ranking. These techniques include things like keyword stuffing, cloaking, and link farming, which are used to "game" the search engine algorithms.
Cyber criminals also exploit whatever news is hot at any given time (celebrity affairs, deaths, etc.) to get malicious pages ranked highly, since people are likely to search for such news.

It is a good idea to make your websites XSS-safe, so that attackers cannot inject their scripts into your pages. If you are a PHP developer, htmlspecialchars and htmlentities are two very useful functions in this regard.
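To make the escaping idea concrete, here is a minimal sketch. It uses Python's standard html.escape as a rough analogue of PHP's htmlspecialchars (a swapped-in technique, not the PHP functions themselves), and render_comment is a hypothetical helper invented for illustration.

```python
import html


def render_comment(user_input: str) -> str:
    """Wrap untrusted text in a paragraph tag after escaping it.

    Hypothetical helper: html.escape replaces &, <, > and (with quote=True)
    both quote characters, so the browser shows the text instead of running it.
    """
    return "<p>" + html.escape(user_input, quote=True) + "</p>"


if __name__ == "__main__":
    payload = '<script>alert("xss")</script>'
    # The script tag comes out as inert text rather than executable markup.
    print(render_comment(payload))
```

Escaping at output time like this is only one layer; which characters need escaping depends on where the value ends up (HTML body, attribute, URL, script), so treat the sketch as a starting point.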

If you are a user, think before you click!

Wednesday, March 24, 2010

Learning/thinking by analogies

For people with a computer science background (but not only them), working with and thinking in analogies is part of life. For example, take the design patterns Adapter, Bridge, Observer, Factory, etc.; they are all analogies. Analogies help us understand and solve the problem at hand.

I found the following analogy, which appeared in an article in the March 2010 issue of Communications of the ACM, interesting:

Alice owns a jewelry store. She has raw precious materials—gold, diamonds, silver, etc.—that she wants her workers to assemble into intricately designed rings and necklaces. But she distrusts her workers and assumes that they will steal her jewels if given the opportunity. In other words, she wants her workers to process the materials into finished pieces, without giving them access to the materials. What does she do?

Here is her plan. She uses a transparent impenetrable glovebox, secured by a lock for which only she has the key. She puts the raw precious materials inside the box, locks it, and gives it to a worker. Using the gloves, the worker assembles the ring or necklace inside the box. Since the box is impenetrable, the worker cannot get to the precious materials, and figures he might as well return the box to Alice, with the finished piece inside. Alice unlocks the box with her key and extracts the ring or necklace. In short, the worker processes the raw materials into a finished piece, without having true access to the materials.

Cryptographically speaking ;), this is what we try to achieve with computation over encrypted data! (Note: this analogy does NOT fully represent this goal as the authors themselves point out)

Monday, March 22, 2010

rebuff huff 'n puff

(decoded title: say no to smoking)

How will the healthcare bill affect medicine?

(The traditional way of managing medical records)
From "10 things you need to know about the healthcare bill":

The bill includes incentives to use more electronic medical records, which should make healthcare more efficient and effective. It would set up pilot programs for medical malpractice tort reform. Community health clinics, which help serve people who often don't have access to other forms of care, would get more funding. Medicare payments would be linked to quality of care, which should shift more providers toward evidence-based standards to see how well treatments work.

Other pilot programs would be set up to study how to improve public health in general, and improve care for people with chronic diseases, rural patients and other groups. The goal is to improve the quality of care while holding the costs down.

Friday, March 19, 2010


When the snow vanishes from the ground
And no cold breeze is to be found
The feeling of thankfulness is profound
As I know that the spring is around
The corner with fresh hope
And I feel like nothing is out of my scope
Trees will slowly and surely start to blossom
Reminding me how awesome
It is to be alive
And a convertible can I drive :)

Wednesday, March 17, 2010

To friend or not to

I don't mean to be paranoid here, but you had better think twice before you become friends with someone on a social network.

It may be an undercover agent that you are accepting as a friend; this could lead to privacy violations if you are an innocent party.
Law enforcement agents are following the rest of the Internet world into popular social-networking services, even going undercover with false online profiles to communicate with suspects and gather private information, according to an internal Justice Department document that surfaced in a lawsuit.

Want to know how they do it and what they can obtain? Read up here.
I don't mind if they use social networks to uncover only those who did something wrong or really questionable, but it would be naive for me to think so.

Facebook's rules, for example, specify that users "will not provide any false personal information on Facebook, or create an account for anyone other than yourself without permission." Twitter's rules prohibit users from sending deceptive or false information. MySpace requires that information for accounts be "truthful and accurate."
I am confused now; can I prosecute an undercover agent on the above grounds?

It may be someone impersonating someone else for a totally different reason:
Around September 20, 2006, Lori Drew created the Myspace account for the "Josh Evans" alias. At the time Drew operated the Josh Evans MySpace account, she was aware that Meier had been taking antidepressant medication. Meier committed suicide as a result of the bullying.
It may be someone who tries to defame you by associating you with something that you are not. For example, tagging you in an image that is not socially acceptable or writing defamatory/incorrect remarks about you on your wall.

How do you know if a person is who he/she claims to be on a social network? Well, there's no formula for that. But it is in general a good idea to check the mutual friends a person has before accepting the request. It may not work in some cases: what if some of your friends have already been fooled into befriending that person? (I have encountered this at least a few times already.)

How privacy vanishes online and some thoughts

Very timely article:
"If a stranger came up to you, would you say your email address, your phone number?
If you have a not so close friend would you tell your DoB to him/her?
Probably not..yet people say it on the Internet."

“Personal privacy is no longer an individual thing: In today’s online world, what your mother told you is true, only more so: people really can judge you by your friends.”

As the article also briefly mentions, you may think that innocuous attributes such as where you work, your current location, where and what you studied, etc. will not identify you as a unique individual. However, there is research indicating that the aggregation of these small pieces of information can be powerful, even to the extent of identifying your social security number. In fact, one of my research goals is to minimize the disclosure of such innocuous credentials when they are used for access control in service consumption scenarios. In other words, the question is "how do I get the service with no or minimal disclosure of credentials while still convincing the service provider?"

Another question I am seeking an answer to is "how much privacy do I lose by revealing different bits of information in different places on the Internet?". Intuitively, as you reveal more attributes about yourself, you become easier to identify. How does this relationship vary? Is your identifiability proportional to something about your attributes? Some attributes reveal more than others. My next question is about identifying that "something": "Can we capture this notion in an information-theoretic way?"
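One common way to make this question concrete is self-information (surprisal): an attribute value shared by a fraction f of the population reveals -log2(f) bits. The sketch below is only an illustration of that idea, not an answer to the question. The population figure and the attribute frequencies are made-up assumptions, and adding the bits together assumes the attributes are statistically independent, which real attributes usually are not.

```python
import math

# Assumption: a rough world population figure, used only for scale.
WORLD_POPULATION = 6_800_000_000


def surprisal_bits(fraction: float) -> float:
    """Bits revealed by an attribute value shared by `fraction` of people."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return -math.log2(fraction)


def bits_to_identify(population: int) -> float:
    """Bits needed to single out one individual in the population."""
    return math.log2(population)


# Hypothetical frequencies, invented for illustration only.
attributes = {
    "gender": 1 / 2,
    "birthday (day and month)": 1 / 365,
    "postal code": 1 / 40_000,
}

if __name__ == "__main__":
    # Summing is only valid under the independence assumption stated above.
    total = sum(surprisal_bits(f) for f in attributes.values())
    for name, f in attributes.items():
        print(f"{name}: {surprisal_bits(f):.2f} bits")
    print(f"revealed so far: {total:.2f} bits")
    print(f"needed to be unique: {bits_to_identify(WORLD_POPULATION):.2f} bits")
```

Under this view, rarer attribute values reveal more bits, which matches the intuition that some attributes reveal more than others.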

Computation over encrypted data [Crypto]

The following diagram shows the ideal situation:
The objective is to perform a general computation over encrypted data (over a finite field) so that the party that performs the computation learns neither the input values nor the result. The computation to be performed (e.g. eigenvalue computation, null space computation, Gaussian elimination, etc.) is public (i.e. known to everyone). Theoretically speaking, one can achieve the above objective using an SMC (Secure Multiparty Computation) protocol that evaluates a scrambled Boolean circuit. However, this is not practical.

Two popular practical techniques that we can use:
1. Commutative encryption (Pohlig-Hellman)
2. Homomorphic encryption (Paillier, Damgard, Unpadded RSA, Benaloh, ElGamal, etc.)
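As a rough illustration of the first technique, here is a toy sketch of exponentiation-based commutative encryption in the style of Pohlig-Hellman: E_k(m) = m^k mod p, with k coprime to p - 1. The prime, keys and function names are assumptions chosen for the example, and the parameters are far too small and unvetted for real use. The modular inverse via pow(k, -1, n) needs Python 3.8 or later.

```python
import math

# Assumption: a toy public prime (the Mersenne prime 2^61 - 1).
P = 2**61 - 1


def check_key(k: int) -> int:
    """A key must be invertible modulo P - 1, otherwise decryption fails."""
    if math.gcd(k, P - 1) != 1:
        raise ValueError("key must be coprime to P - 1")
    return k


def encrypt(m: int, k: int) -> int:
    """E_k(m) = m^k mod P."""
    return pow(m, check_key(k), P)


def decrypt(c: int, k: int) -> int:
    """D_k(c) = c^(k^-1 mod (P - 1)) mod P."""
    return pow(c, pow(check_key(k), -1, P - 1), P)


if __name__ == "__main__":
    alice_key, bob_key = 17, 19  # hypothetical secret exponents
    message = 123456789

    # Encrypting under both keys gives the same result in either order.
    ab = encrypt(encrypt(message, alice_key), bob_key)
    ba = encrypt(encrypt(message, bob_key), alice_key)
    print(ab == ba)

    # The layers can also be removed in either order.
    print(decrypt(decrypt(ab, alice_key), bob_key) == message)
```

The order-independence is what makes this kind of scheme handy in two-party protocols: each party can add or strip its own layer without knowing the other's key.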

Since I am interested in one-off computation, IMO, homomorphic encryption is the most suitable here. Computations over finite fields, in general, involve two binary operations (e.g. addition and multiplication). However, all the practical homomorphic cryptosystems are homomorphic with respect to only one operation (e.g. addition: Paillier, Damgard, Benaloh; multiplication: Unpadded RSA, ElGamal). It should be noted that in the middle of last year, IBM published a paper on fully homomorphic encryption using ideal lattices, but it is computationally intensive and thus not suitable for real applications. So, inventing a practical fully homomorphic encryption scheme is still an open problem. Until then, we need to rely on specialized protocols to solve the aforementioned problem.
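The "one operation only" limitation is easy to see with toy numbers. The sketch below shows that multiplying two unpadded RSA ciphertexts yields an encryption of the product, and that multiplying two Paillier ciphertexts yields an encryption of the sum. The primes, exponent and function names are assumptions picked for illustration; the key sizes are deliberately tiny and insecure, and the Paillier variant uses the common simplification g = n + 1. It needs Python 3.8 or later for pow(x, -1, n).

```python
import math
import random

# Assumption: toy primes, far too small for any real use.
P, Q = 61, 53
N = P * Q
N2 = N * N

# --- Unpadded RSA: multiplicatively homomorphic ---
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))


def rsa_encrypt(m: int) -> int:
    return pow(m, E, N)


def rsa_decrypt(c: int) -> int:
    return pow(c, D, N)


# --- Paillier with g = N + 1: additively homomorphic ---
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)


def paillier_encrypt(m: int) -> int:
    # Pick a random r that is invertible modulo N.
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def paillier_decrypt(c: int) -> int:
    # L(x) = (x - 1) / N, then multiply by MU modulo N.
    return ((pow(c, LAMBDA, N2) - 1) // N * MU) % N


if __name__ == "__main__":
    # Product of RSA ciphertexts decrypts to the product of the plaintexts.
    print(rsa_decrypt(rsa_encrypt(7) * rsa_encrypt(6) % N))

    # Product of Paillier ciphertexts decrypts to the sum of the plaintexts.
    print(paillier_decrypt(paillier_encrypt(20) * paillier_encrypt(22) % N2))
```

Neither scheme alone supports both operations on ciphertexts, which is why a general finite-field computation cannot simply be run on top of either one.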

Sir Ken Robinson: Do schools/universities kill creativity?

Very valid points!

The points that made me think the most were that our education system stigmatizes mistakes, and that schools and universities are like factories that produce people to work in industry.

Monday, March 15, 2010

Monday, March 8, 2010

TBL on linked data

The talk is about one year old, but still interesting and current. This year's ICDE conference also had some interesting papers on topics related one way or the other to linked data.

Slides of my talk at ICDE 2010

Last week, we had the ICDE 2010 conference in Long Beach, CA. Here are the slides of my talk.